pybedtools reads all supported file formats through the BedTool class. Besides BED, GFF, and GTF files, it can also read gzip-compressed (.gz) files.
from pybedtools import BedTool
snps = BedTool('snps.bed.gz') # [1]
genes = BedTool('hg19.gff') # [1]
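To make the idea of reading plain and gzip-compressed files through one interface concrete, here is a minimal stdlib-only sketch. It is not how pybedtools is implemented; open_bed and read_intervals are hypothetical helpers, and the file names are made up for the demo.

```python
import gzip
import os
import tempfile


def open_bed(path):
    """Open a BED-like file as text, whether or not it is gzip-compressed."""
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == b"\x1f\x8b"  # gzip magic number
    return gzip.open(path, "rt") if is_gzip else open(path, "rt")


def read_intervals(path):
    """Return (chrom, start, end) tuples, skipping blank, comment and track lines."""
    intervals = []
    with open_bed(path) as handle:
        for line in handle:
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            chrom, start, end = line.rstrip("\n").split("\t")[:3]
            intervals.append((chrom, int(start), int(end)))
    return intervals


# Demo on throwaway files (assumed names, created in a temp directory).
workdir = tempfile.mkdtemp()
plain_path = os.path.join(workdir, "snps.bed")
gz_path = os.path.join(workdir, "snps.bed.gz")
content = "chr1\t10\t11\trs1\nchr1\t50\t51\trs2\n"
with open(plain_path, "w") as handle:
    handle.write(content)
with gzip.open(gz_path, "wt") as handle:
    handle.write(content)

plain_intervals = read_intervals(plain_path)
gz_intervals = read_intervals(gz_path)
```

Both calls return the same list of intervals, which is the behavior the caller sees when pointing BedTool at either kind of file.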
Wrapper around Aaron Quinlan's BEDtools suite of programs (https://github.com/arq5x/bedtools); also contains many useful methods for more detailed work with BED files.
fn is typically the name of a BED-like file, but can also be one of the following:
a string filename
another BedTool object
an iterable of Interval objects
an open file object
a "file contents" string (see below)
If from_string is True, then you can pass a string that contains the contents of the BedTool you want to create. This will treat all spaces as TABs and write to a tempfile, treating whatever you pass as fn as the contents of the BED file. This also strips empty lines.
Typical usage is to point to an existing file:
But you can also create one from scratch from a string:
>>> s = '''
... chrX  1  100
... chrX 25  800
... '''
>>> a = BedTool(s, from_string=True)
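The from_string behavior described above (spaces become TABs, empty lines are dropped, the result goes to a temp file) can be sketched with the standard library alone. This is an illustration of the described behavior under those assumptions, not pybedtools' actual code; bed_string_to_tempfile is a hypothetical name.

```python
import os
import tempfile


def bed_string_to_tempfile(contents):
    """Write BED-like text to a temp file and return its path.

    Hypothetical helper: runs of whitespace become single TABs and
    blank lines are skipped.
    """
    rows = []
    for line in contents.splitlines():
        fields = line.split()  # split on any run of spaces or tabs
        if fields:             # drop empty lines
            rows.append("\t".join(fields))
    fd, path = tempfile.mkstemp(suffix=".bed")
    with os.fdopen(fd, "w") as handle:
        handle.write("".join(row + "\n" for row in rows))
    return path


path = bed_string_to_tempfile("""
chrX  1  100
chrX 25  800
""")
with open(path) as handle:
    normalized = handle.read()
os.unlink(path)  # clean up the temp file
print(repr(normalized))
```

The leading and trailing blank lines of the triple-quoted string disappear, and the irregular spacing between columns is normalized to one TAB per field boundary.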
Or use examples that come with pybedtools:
>>> example_files = pybedtools.list_example_files()
>>> assert 'a.bed' in example_files
>>> a = pybedtools.example_bedtool('a.bed')
>>> c = a_with_b.saveas('intersection-of-a-and-b.bed', trackline='track name="a and b"')
>>> print(c.fn)
>>> # opening the underlying file shows the track line
>>> print(open(c.fn).read())
track name="a and b"
chr1	155	200	feature2	0	+
chr1	155	200	feature3	0	-
chr1	900	901	feature4	0	+
>>> # printing file-based BedTool objects will not print the track line
>>> print(c)
chr1	155	200	feature2	0	+
chr1	155	200	feature3	0	-
chr1	900	901	feature4	0	+
Make a copy of the BedTool.
Optionally adds trackline to the beginning of the file.
Optionally compresses output using gzip.
If the filename extension is .gz, or compressed=True, the output is compressed using gzip.
Returns a new BedTool for the newly saved file.
A newline is automatically added to the trackline if it does not already have one.
Example usage:
>>> a = pybedtools.example_bedtool('a.bed')
>>> b = a.saveas('other.bed')
>>> b.fn
'other.bed'
>>> print(b == a)
True
>>> import os
>>> b = a.saveas('other.bed', trackline="name='test run' color=0,55,0")
>>> open(b.fn).readline()
"name='test run' color=0,55,0\n"
>>> if os.path.exists('other.bed'):
...     os.unlink('other.bed')
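The rules listed for saveas (optional track line with an automatic trailing newline, gzip output when the name ends in .gz or compressed=True) can be illustrated without pybedtools. The sketch below is a stand-in written against those documented rules only; save_copy and the file names are hypothetical, and the real method does more, such as returning a new BedTool.

```python
import gzip
import os
import tempfile


def save_copy(src, dst, trackline=None, compressed=False):
    """Copy src to dst, optionally prepending a track line and gzipping.

    Hypothetical stand-in for the documented saveas() rules.
    """
    use_gzip = compressed or dst.endswith(".gz")
    opener = gzip.open if use_gzip else open
    with open(src, "rt") as fin, opener(dst, "wt") as fout:
        if trackline is not None:
            if not trackline.endswith("\n"):
                trackline += "\n"  # add the newline if it is missing
            fout.write(trackline)
        for line in fin:
            fout.write(line)
    return dst


# Demo on a throwaway one-line BED file.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "a.bed")
with open(src, "w") as handle:
    handle.write("chr1\t155\t200\tfeature2\t0\t+\n")

plain = save_copy(src, os.path.join(workdir, "copy.bed"),
                  trackline='track name="a and b"')
packed = save_copy(src, os.path.join(workdir, "copy.bed.gz"))

with open(plain) as handle:
    first_line = handle.readline()
with gzip.open(packed, "rt") as handle:
    packed_text = handle.read()
```

Because the source is copied line by line, the original file is still present afterwards, which is the key difference from a move.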
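Alternatively, if you do not need to add a track line, you can use BedTool.moveto(). It is faster and better suited to large files. Note that it renames the file rather than copying it, so any later attempt to use the original file will fail because that file no longer exists.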
>>> d = a_with_b.moveto('another_location.bed')
Move to a new filename (can be much quicker than BedTool.saveas())
Move BED file to new filename, fn.
Returns a new BedTool for the new file.
Example usage:
>>> # make a copy so we don't mess up the example file
>>> a = pybedtools.example_bedtool('a.bed').saveas()
>>> a_contents = str(a)
>>> b = a.moveto('other.bed')
>>> b.fn
'other.bed'
>>> b == a_contents
True
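The practical difference between copying and moving, namely that the original path is gone after a move, can be shown with plain file operations. This sketch uses shutil from the standard library as an analogy only; it makes no claim about how moveto is implemented internally, and the file names are invented for the demo.

```python
import os
import shutil
import tempfile

# Create a small throwaway BED file to work on.
workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "a.bed")
with open(original, "w") as handle:
    handle.write("chr1\t1\t100\n")

# Copying duplicates the data and leaves the source in place.
copied = shutil.copy(original, os.path.join(workdir, "copy.bed"))
exists_after_copy = os.path.exists(original)

# Moving within one filesystem is a rename: no data is rewritten,
# but the source path no longer exists afterwards.
moved = shutil.move(original, os.path.join(workdir, "moved.bed"))
exists_after_move = os.path.exists(original)

with open(moved) as handle:
    moved_text = handle.read()
```

A rename avoids rewriting the file contents, which is why moving can be much quicker than copying for large files.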