Tuesday, 20 June 2017
Bacmet database
http://bacmet.biomedicine.gu.se/browse_by_compounds_get_info.pl?compound=Zinc%20(Zn)
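The link above passes the compound name as a URL-encoded query parameter. A minimal sketch of building the same kind of link for another compound, assuming the script path and the `compound` parameter seen in that URL stay valid (the helper name `bacmet_compound_url` is made up for illustration and is not part of BacMet):

```python
from urllib.parse import quote

# Base path copied from the link above; assumed to remain stable.
BASE = "http://bacmet.biomedicine.gu.se/browse_by_compounds_get_info.pl"

def bacmet_compound_url(compound: str) -> str:
    """Build a compound-info link: spaces become %20, parentheses are kept as-is."""
    return f"{BASE}?compound={quote(compound, safe='()')}"

print(bacmet_compound_url("Zinc (Zn)"))
```

Whether other compound names follow the same "Name (Symbol)" pattern is an assumption to check against the site's own compound list.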
Friday, 2 September 2016
EnteroBase, a genome database of enteric bacteria (Warwick)
EnteroBase: A Powerful, User-Friendly Online Resource for Analyzing and Visualizing Genomic Variation within Enteric Bacteria
Getting Started
Using the website
Using the API
About the underlying Pipelines, EnteroTools
Objectives
EnteroBase aims to establish a world-class, one-stop, user-friendly, backwards-compatible but forward-looking genome database, together with a set of web-based tools, EnteroTools, to enable bacteriologists to identify, analyse, quantify and visualize genomic variation principally within the genera:
Escherichia
Salmonella
the Yersiniae
Moraxella
EnteroBase is populated with over 100,000 genomic assemblies derived from publicly available complete genomes, sequence read archives and user uploads.
Funded by BBSRC research grant BB/L020319/1.
Implementation
EnteroBase is strain based. Each strain is associated with metadata and genomic assemblies, as well as with deduced genotyping data. All assemblies are performed de novo from Illumina/PacBio reads using a standardised, versioned pipeline. Unless explicitly chosen, only assemblies that match pre-defined criteria are displayed, and where multiple assemblies are associated with a strain, only the best assembly according to assembly criteria is displayed. Similarly, assemblies whose genotypes conflict with their metadata are also not displayed by default.
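The default display rule described above (hide assemblies that fail the criteria or conflict with their metadata, then keep the best one per strain) can be sketched as follows. This is only an illustration: the `Assembly` fields and the use of N50 as the ranking score are assumptions, since the text does not say which assembly criteria EnteroBase actually applies.

```python
from dataclasses import dataclass

@dataclass
class Assembly:
    strain: str
    n50: int                          # hypothetical quality score used for ranking
    passes_criteria: bool             # matches the pre-defined assembly criteria
    genotype_matches_metadata: bool   # genotype does not conflict with metadata

def displayed_assemblies(assemblies):
    """Return at most one assembly per strain: the best one passing both default filters."""
    best = {}
    for a in assemblies:
        if not (a.passes_criteria and a.genotype_matches_metadata):
            continue  # hidden by default
        current = best.get(a.strain)
        if current is None or a.n50 > current.n50:
            best[a.strain] = a
    return best

demo = [
    Assembly("strain_1", 100, True, True),
    Assembly("strain_1", 200, True, True),
    Assembly("strain_1", 300, False, True),   # fails criteria, ignored
    Assembly("strain_2", 50, True, False),    # conflicts with metadata, ignored
]
for strain, assembly in displayed_assemblies(demo).items():
    print(strain, assembly.n50)
```

A strain whose assemblies all fail a filter simply does not appear in the result, which mirrors the "not displayed by default" behaviour.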
Genotyping data is deduced exclusively from assemblies. MLST data is called by uBlast against a dataset of allelic sequences that differ from each other by at least 2.5%. Other genotyping methods are under development. Genotyping data is summarised in the Experimental Data pane. The full data, including assemblies, can be downloaded freely (but see Fair Usage).
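To make the 2.5% figure concrete, here is a small sketch that checks whether every pair in a set of allele sequences differs by at least that much. It assumes the sequences are already aligned to the same length and measures difference as the percentage of mismatched positions; how EnteroBase computes the distance is not stated, so both points are assumptions, as are the function names.

```python
from itertools import combinations

def percent_difference(a: str, b: str) -> float:
    """Percentage of positions that differ between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100.0 * mismatches / len(a)

def too_similar_pairs(alleles: dict, threshold: float = 2.5):
    """List the name pairs whose sequences differ by less than `threshold` percent."""
    return [
        (n1, n2)
        for (n1, s1), (n2, s2) in combinations(alleles.items(), 2)
        if percent_difference(s1, s2) < threshold
    ]

# Toy sequences: one mismatch in 40 positions is exactly 2.5%.
alleles = {"allele_1": "A" * 40, "allele_2": "A" * 39 + "C"}
print(too_similar_pairs(alleles))
```

An empty list means every pair meets the threshold.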
Fair Usage
All metadata, assemblies and genotyping data can be freely downloaded for academic purposes. In order to allow users who upload unpublished data sufficient time to perform their own analyses, we request that no analyses of user data be published without their explicit permission prior to the release date. Both metadata and genomic data will be clearly marked if they are downloaded prior to the release date. We would also consider it fair usage that users who wish to analyse very large amounts of the data stored in EnteroBase also contribute software tools to EnteroBase that facilitate the presentation and analysis of their results. Downloading and analysis of data by commercial enterprises can only be performed after explicit permission by the administrators, which may involve legal agreements regarding material transfer.
Data Privacy
EnteroBase users are encouraged to upload their own reads to the website, which will be assembled and genotyped like existing public data. Submitters should note that raw data (sequence reads) will never be made public through the website to other users. The genome assembly will only be accessible to the data submitter and their buddies for 6 months after upload. Assembly data will then be made public; longer release dates can be negotiated by contacting Martin Sergeant at M.J.Sergeant@warwick.ac.uk. Genotyping results, i.e. MLST, ribosomal MLST, core genome MLST and in silico serotyping, will be made public as soon as the uploaded data has been processed. User passwords on the website are encrypted and no one, including administrators, can easily access them. However, we would advise you NOT to use the same password you would use for important accounts, such as internet banking.
Citation
EnteroBase has not yet been formally published. If you use data or metadata from the website, or analyses based on these data, please cite the EnteroBase website directly: http://enterobase.warwick.ac.uk
An extended citation could be:
EnteroBase. [online] Enterobase.warwick.ac.uk. Available at: http://enterobase.warwick.ac.uk [Accessed 1 January 2016].
3rd Party acknowledgements
If you use data generated by 3rd party tools in EnteroBase, please cite both EnteroBase and the paper describing the specific tool.
rMLST is Copyright © 2010-2016, University of Oxford. rMLST is described in: Jolley et al. 2012 Microbiology 158:1005-15
Serovar predictions (SISTR) have been calculated using the pipeline developed by the SISTR team, which is described in Yoshida et al. 2016 PLoS ONE 11(1): e0147101
How to use this wiki
You can browse topics through the links at the top
You can click the HELP button on every EnteroBase webpage to see the documentation about that page
You can click the information ( i ) icons on column headers to get information about that column.
https://enterobase.warwick.ac.uk/
Tuesday, 30 August 2016
PASIFIC, an online tool to predict regulatory, premature transcription termination in bacteria
From the website
A common strategy for regulation of gene expression in bacteria is conditional transcription termination. This strategy is frequently employed by 5′UTR cis-acting RNA elements (riboregulators), including riboswitches and attenuators. Such riboregulators can assume two mutually exclusive RNA structures, one of which forms a transcriptional terminator and results in premature termination, and the other forms an antiterminator that allows read-through into the coding sequence to produce a full-length mRNA. We developed a machine-learning based approach, which, given a 5′UTR of a gene, predicts whether it can form the two alternative structures typical to riboregulators employing conditional termination. Using a large positive training set of riboregulators derived from 89 human microbiome bacteria, we show high specificity and sensitivity for our classifier. We further show that our approach allows the discovery of previously unidentified riboregulators, as exemplified by the detection of new LeuA leaders and T-boxes in Streptococci. Finally, we developed PASIFIC (www.weizmann.ac.il/molgen/sorek/PASIFIC), an online web-server that, given a user-provided 5′UTR sequence, predicts whether this sequence can adopt two alternative structures conforming with the conditional termination paradigm. This webserver is expected to assist in the identification of new riboswitches and attenuators in the bacterial pan-genome.
Thursday, 6 September 2012
CRISPRs: The CRISPR database
From the website
This site acts as a gateway to publicly accessible CRISPRs database and software. It enables the easy detection of CRISPRs in locally-produced data and consultation of CRISPRs present in the database. It also gives information on the presence of CRISPR-associated (cas) genes when they have been annotated as such.
This web site is the product of an original work by Ibtissem Grissa (PhD thesis Paris University) and is presently developed by Christine Drevet.
Weblink
CRISPRs
Friday, 16 March 2012
A database for bacterial group II introns
(from the paper)
The Database for Bacterial Group II Introns provides a catalogue of full-length, non-redundant group II introns present in bacterial DNA sequences in GenBank. The website is divided into three sections. The first section provides general information on group II intron properties, structures and classification. The second and main section lists information for individual introns, including insertion sites, DNA sequences, intron-encoded protein sequences and RNA secondary structure models. The final section provides tools for identification and analysis of intron sequences. These include a step-by-step guide to identify introns in genomic sequences, a local BLAST tool to identify closest intron relatives to a query sequence, and a boundary-finding tool that predicts 5' and 3' intron-exon junctions in an input DNA sequence. Finally, selected intron data can be downloaded in FASTA format.
http://webapps2.ucalgary.ca/~groupii/index.html#
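Since intron data can be downloaded in FASTA format, a few lines are enough to read such a file into memory. The sketch below is a generic FASTA reader, not a tool provided by the database; the record names in the sample text are invented placeholders.

```python
def parse_fasta(text: str) -> dict:
    """Map each FASTA header (without '>') to its sequence, joining wrapped lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}

# Placeholder records standing in for a downloaded file.
sample = """>example_intron_1
GTGCGCCC
AGATAGGG
>example_intron_2
GUGCG
"""
for name, seq in parse_fasta(sample).items():
    print(name, len(seq))
```

For real work, an established parser such as the one in Biopython is usually a safer choice than a hand-written reader.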
Monday, 16 January 2012
The ISCR elements database
(from the website)
ISCR (Insertion Sequence Common Region) elements are Insertion sequences that have similarities to the IS91 family in both structure and function. These two insertion sequence families have several important features that are unique among IS elements: their terminal sequences are unrelated to each other instead of being inverted repeats; their transposases lack the normal DDE amino acid motif found in the majority of IS elements transposases; they do not generate directly repeated sequence on insertion. The IS91 family consists of three members IS91, IS801 and IS1294 and the ISCR family currently has 19 members ISCR1-19. Analysis of the genetic loci of the various ISCR elements has revealed that the vast majority of these elements are found in close association with antimicrobial resistance genes that are not the normal complement of the host genome. They are thus implicated in the acquisition of these genes and appear to have a specialisation for resistance gene transposition.
http://medicine.cf.ac.uk/en/research/research-groups/i3/research/antibacterial-agents/iscr-elements/