Massive RT-PCR Primer Designer is online

The Massive RT-PCR Primer Designer is based on Primer3 and BioPerl toolbox, and was derived from the standalone codes which have been discussed by two previous blog posts (Post 1 and Post 2).

rt-primer-designer

You may access the service at The Massive RT-PCR Primer Designer.

What this tool needs is your fasta file, which contains one or more nucleotide sequences that will be use as PCR template. Upload this file and the tool will generate the most ideal primer pairs suitable for quantitive Real-Time PCR (qRT-PCR). The primers will be supplied as a downloadable Excel file.

Notably, the amplicon size of these RT-PCR primers are spanning from 150 bp to 250 bp.

Powered by AliYun, the service is available at 7×24h.

By the way, apply this promotion code u6fovy when you buy Aliyun service for a 10% OFF.

ContigExpress-Assemble pair-end sequencing fragments

送样测序的时候,测序公司会问你是否要从两端测通?如果你选择两端测通,可能会得到两个序列(也可能测序公司已经帮你拼接成一条序列了),这时候你需要将这两条序列拼接起来。

我们使用ContigExpress这个程序来完成这一任务。

ContigExpress是Vector NTI© Software中的一个组件,可以在这里下载:http://www.thermofisher.com/cn/zh/home/life-science/cloning/vector-nti-software.html

contigexpress1

contigexpress2

contigexpress3

contigexpress4

contigexpress5

contigexpress6

未尽事宜,请查阅 Vector NTI Advance®11.5_Users_Manual 下列内容 How Do I : Assemble Chromatogram Data Using ContigExpress:

contigexpress7

A Genbank to BED converter

Like the previous Genbank -> GTF converter, this script is also depended on BioPerl, so you need firstly get the module installed in your system (Install BioPerl). To check whether BioPerl is ready, just refer to this post: bio-spring.info/check-bioperl/ .

SYNOPSIS


gb2bed.pl [options] filename

Options:
--help -h display this message
--input -i NCBI GenBank file
--out -o filename of bed output. [STDOUT]
--filter -f GenBank primary tag that will be filtered, may be set to 'gene', 'CDS', 'rRNA', etc.
--keep -k GenBank primary tag will be keep. Default is CDS. [CDS]

Examples:

perl gb2bed.pl -i seq.gb -o seq.bed
perl gb2bed.pl -i seq.gb -k CDS -o seq.bed
perl gb2bed.pl seq.gb > seq.bed

Read Pod for further information.


#!/usr/bin/perl

=pod

=head1 NAME

gb2bed.pl -- Genbank -> BED

=head1 SYNOPSIS

gb2bed.pl [options] filename

Options:
--help -h display this message
--input -i NCBI GenBank file
--out -o filename of bed output. [STDOUT]
--filter -f GenBank primary tag that will be filtered, may be set to 'gene', 'CDS', 'rRNA', etc.
--keep -k GenBank primary tag will be keep. Default is CDS. [CDS]

Examples:

perl gb2bed.pl -i seq.gb -o seq.bed
perl gb2bed.pl -i seq.gb -k CDS -o seq.bed
perl gb2bed.pl seq.gb > seq.bed

=head1 DESCRIPTION

Use this program to generate bioconductor friendly BED files from NCBI GenBank.
Six columns will be extracted:
1. chrom - name of the chromosome or scaffold. Any valid seq_region_name can be used, and chromosome names can be given with or without the 'chr' prefix.
2. chromStart - Start position of the feature in standard chromosomal coordinates (i.e. first base is 0).
3. chromEnd - End position of the feature in standard chromosomal coordinates
4. name - Label to be displayed under the feature, if turned on in "Configure this page". [locus_tag]
5. score - A score between 0 and 1000. [0]
6. strand - defined as + (forward) or - (reverse).

=head1 AUTHOR

Chun-Hui, Gao (gaoch@thelifesciencecentury.com)
Copyright (c) 2015 www.thelifesciencecentury.com.

=cut

use strict;
use Data::Dumper;
use Pod::Usage;
use Bio::SeqIO;
use Getopt::Long;

my ($help, $genbank_input, $output, @filters, @keep);

my $ok= GetOptions( 'help|h' => \$help,
'input|i=s' => \$genbank_input,
'filter|f=s' => \@filters,
'keep|k=s' => \@keep,
'out|o=s' => \$output );
pod2usage(2) if $help || !$ok;

$genbank_input = shift @ARGV unless ($genbank_input );

@keep = ('gene') unless $#keep >= 0;

open *IN, "< $genbank_input " or die pod2usage(2); my $out; if ($output) { open $out, "> $output" or die "Can't open file $output:$@\n";
}
else {
$out = *STDOUT;
}

my %filter;
map {$filter{$_}++} @filters;
my %keep;
map {$keep{$_}++} @keep;
my $in = Bio::SeqIO->new(-fh => \*IN, -format => "genbank");
while ( my $seq = $in->next_seq() ) {
my $chr = $seq->accession_number;

my @all_SeqFeatures = $seq->get_all_SeqFeatures;

#~ warn "# working on $chr\n";
if ($#filters >= 0){
@all_SeqFeatures = grep { !defined $filter{$_->primary_tag} } @all_SeqFeatures;
}
if ($#keep >= 0) {
@all_SeqFeatures = grep { defined $keep{$_->primary_tag} } @all_SeqFeatures;
}

# abort if there are no features
warn "$chr has no features, skipping\n" and next
if $#all_SeqFeatures < 0; for my $feature ( @all_SeqFeatures ) { my ($start, $end, $name, $strand); $start = $feature->start - 1; # bed base is 0, while genbank is 1
$end = $feature->end;
($name) = $feature->get_tag_values('locus_tag');
$strand = $feature->strand;
$strand = $strand < 0 ? '-' : '+'; print $out join("\t", $chr, $start, $end, $name, 0, $strand), "\n"; } }

A Genbank –> GTF converter derived from BioPerl

BioPerl has a Genbank –> GFF converter (the script is bp_genbank2gff3.pl), however, this converter generated GFF/GTF file could not work in Bioconductor when using the function maketxdbfromGFF to construct a transcript database.

Currently, I have developed such a converter that may extract information from a NCBI Genbank file and write them to GFF version 2, or GTF format. This script is derived from BioPerl, so you need firstly get the module installed in your system. To check whether BioPerl is ready, just refer to this post: bio-spring.info/check-bioperl/ .

Here is the synopsis of this script.

SYNOPSIS


NCBI_genbank2gtf.pl [options] filename

Options:
--help -h display this message
--input -i NCBI GenBank file
--out -o filename of GTF output

Examples:

perl NCBI_genbank2gtf.pl -i seq.gb -o seq.gtf
perl NCBI_genbank2gtf.pl seq.gb > seq.gtf

The codes is as follows:

#!/usr/bin/perl

=pod

=head1 NAME

NCBI_genbank2gtf.pl -- Genbank -> Bioconductor friendly GTF

=head1 SYNOPSIS

NCBI_genbank2gtf.pl [options] filename

Options:
--help -h display this message
--input -i NCBI GenBank file
--out -o filename of GTF output

Examples:

perl NCBI_genbank2gtf.pl -i seq.gb -o seq.gtf
perl NCBI_genbank2gtf.pl seq.gb > seq.gtf

=head1 DESCRIPTION

Use this program to generate bioconductor friendly GTF files from NCBI GenBank.
The file may be used in *maketxdbfromGFF* function and should work well.

=head1 AUTHOR

Chun-Hui, Gao (gaoch@thelifesciencecentury.com)

Copyright (c) 2015 www.thelifesciencecentury.com.

=cut

use strict;
use Data::Dumper;
use Pod::Usage;
use Bio::SeqIO;
use Getopt::Long;

my ($help, $genbank_input, $gtf_output);

my $ok= GetOptions( 'help|h' => \$help,
'input|i=s' => \$genbank_input,
'out|o=s' => \$gtf_output );
pod2usage(2) if $help || !$ok;

$genbank_input = shift @ARGV unless ($genbank_input );

open *IN, "< $genbank_input " or die pod2usage(2); my $out; if ($gtf_output) { open $out, "> $gtf_output" or die "Can't open file $gtf_output:$@\n";
}
else {
$out = *STDOUT;
}

my %seen;
my $in = Bio::SeqIO->new(-fh => \*IN, -format => "genbank");
while ( my $seq = $in->next_seq() ) {
my $seq_acc = $seq->accession_number;

# abort if there are no features
warn "$seq_acc has no features, skipping\n" and next
if !$seq->all_SeqFeatures;

# construct a GFF header
# add: get source_type from attributes of source feature? chromosome=X tag
# also combine 1st ft line here with source ft from $seq ..
my $header = gff_header($seq);
print $out $header;
warn "# working on $seq_acc\n";

for my $feature ( $seq->get_all_SeqFeatures ) {

my $method = $feature->primary_tag;
$seen{$method} ++;
$feature = maptags2gff($feature);
print $out $feature->gff_string(), "\n";
if ($method eq 'CDS'){
$feature->primary_tag("exon");
$feature->add_tag_value("exon_number", 1);
$feature->add_tag_value("exon_id", $feature->get_tag_values("transcript_id"));
$feature = maptags2gff($feature);
print $out $feature->gff_string(), "\n";
}

}

}

warn "# Count of features in $genbank_input:\n";
map {warn sprintf("# %20s:%6d\n", $_, $seen{$_});} keys %seen;

sub gff_header {
my ($seq) = @_;
my $head = sprintf("##gff-version 2\n# sequence-region: %s (1..%d)\n", $seq->accession_number, $seq->length);
$head .= sprintf( "# %s\n", $seq->desc);
$head .= "# converted by NCBI_genbank2gtf.pl\n";

#~ print Dumper $acc, $desc, $end;
return $head;

}

sub maptags2gff {
my $f = shift;
my $method = $f->primary_tag;
my %TAG_MAP;
if ($method eq 'gene' || $method eq 'misc_RNA' || $method eq 'rRNA' || $method eq 'tRNA'){
$f->add_tag_value('transcript_id',$f->get_tag_values('locus_tag'));
%TAG_MAP = ( gene => 'gene_name',
transcript_id => 'transcript_id',
locus_tag => 'gene_id' );
}
elsif ($method eq 'CDS') {
$f->add_tag_value('transcript_id',$f->get_tag_values('locus_tag'));
%TAG_MAP = ( locus_tag => 'gene_id',
protein_id => 'protein_id',
transcript_id => 'transcript_id',
gene => 'gene_name',
product => 'product');

}
elsif ($method eq 'exon') {

%TAG_MAP = ( gene_id => 'gene_id',
protein_id => 'protein_id',
transcript_id => 'transcript_id',
exon_number => 'exon_number',
exon_id => 'exon_id',
gene_name => 'gene_name');

}
elsif ($method eq 'source') {
%TAG_MAP = ( organism => 'organism' );

}
else {

}

my @all_tags = $f->get_all_tags();
#~ print Dumper \@all_tags;
for my $tag (@all_tags){

if (exists $TAG_MAP{$tag}){
my $newtag= $TAG_MAP{$tag};
my @v= $f->get_tag_values($tag);
$f->remove_tag($tag);
$f->add_tag_value($newtag,@v);
}
else {
$f->remove_tag($tag);
}
}

return $f;
}

If you would like to access the full document, run:
perldoc NCBI_genbank2gtf.pl

Feedback is welcome.

Here is also a GenBank -> BED converter:

cloudfs无法上传的故障排除

之前设置WordPress使用阿里云OSS存储图片,今天出问题了。设置方法在这里

错误提示:cannot access oss: Transport endpoint is not connected

解决方案如下:

  1. 去除挂载
  2. 重启Cloudfs服务
  3. 设置权限

[cc lang=”bash”]
umount /mnt/oss
service cloudfs restart
chmod 666 /mnt/oss
[/cc]