<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>GenomeQuest Industry &#187; Message from Technology Team</title>
	<atom:link href="http://blog.genomequest.com/category/technology/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.genomequest.com</link>
	<description>Conversations on the convergence of SDM, cloud computing, and applications to personalized medicine</description>
	<lastBuildDate>Thu, 12 Jan 2012 23:33:19 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>How fast is your read mapping algorithm?</title>
		<link>http://blog.genomequest.com/2011/06/how-fast-is-your-read-mapping-algorithm/</link>
		<comments>http://blog.genomequest.com/2011/06/how-fast-is-your-read-mapping-algorithm/#comments</comments>
		<pubDate>Tue, 14 Jun 2011 07:25:38 +0000</pubDate>
		<dc:creator>Henk Heus</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Message from Technology Team]]></category>
		<category><![CDATA[Bioinformatics]]></category>
		<category><![CDATA[Bowtie]]></category>
		<category><![CDATA[BWA]]></category>
		<category><![CDATA[GASSST]]></category>
		<category><![CDATA[GenomeQuest]]></category>
		<category><![CDATA[GenomeQuest Engine]]></category>
		<category><![CDATA[GQ-Engine]]></category>
		<category><![CDATA[Next Generation Sequencing]]></category>
		<category><![CDATA[Sequence Data Management]]></category>

		<guid isPermaLink="false">http://blog.genomequest.com/?p=379</guid>
		<description><![CDATA[
This is a question that is often asked when I demo the GenomeQuest platform to potential customers. I always answer that question in three phases.
The first phase goes like this: &#8220;It&#8217;s really fast, it&#8217;s certainly not any slower than BWA/Bowtie or anything else out there.&#8221; The next question is always: &#8220;Well, do you have any benchmarks?&#8221;
Which [...]]]></description>
			<content:encoded><![CDATA[<div>
<p>This is a question that is often asked when I demo the GenomeQuest platform to potential customers. I always answer that question in three phases.</p>
<p>The first phase goes like this: &#8220;It&#8217;s really fast, it&#8217;s certainly not any slower than BWA/Bowtie or anything else out there.&#8221; The next question is always: &#8220;Well, do you have any benchmarks?&#8221;</p>
<p>Which nicely transitions me into the second phase of the answer. This phase is much more rigorous and usually starts with: &#8220;Well, it depends. Let me try to explain.&#8221;:</p>
<ul>
<li>Any good computer scientist can write a mapping algorithm that is really fast. However, that doesn&#8217;t mean anything unless it produces the kind of results that are needed. The NGS application you&#8217;re working with matters here. For example, finding genetic variation in human disease requires very accurate mapping with mismatches and indels. In contrast, with digital gene expression in maize you can cut a lot of corners. You just need to have a rough idea of the number of alignments on a transcript.</li>
<li>Then there is the matter of the data and the technology that produced it. Do you have long reads (more than 120 bp)? Will your mapper handle them? Will it also increase the number of mismatches / indels it can use to align a read? Will it significantly slow down execution, or eat up all your memory when reads get longer? Does this mapper also support local alignments when you need them? Will it align in colorspace? In paired end mode? I could go on.</li>
<li>Next there is the matter of connecting results to the following step in your pipeline. How long does it take to de-duplicate a 1 TB alignment file in SAM/BAM format? Or to find those alignments whose position overlaps with your exome capture experiment, or with all dbSNP entries? At GenomeQuest we have a very efficient way of storing and handling alignments (including sequences/annotation). This saves real time, especially when compared to the alignment step itself (it saves a lot of disk space as well, by the way).</li>
</ul>
<p>By the time we get to the third phase of the answer I&#8217;m usually much more confident: &#8220;Well, how fast do you need our mapping algorithm to be?&#8221;.</p>
<ul>
<li>Does it really matter how fast the read mapper is, as long as it&#8217;s comparable in performance to other algorithms for most common use cases? Does it matter if you have the alignments in 2.5 hours instead of 3? Maybe if you analyze thousands of samples per week it matters, but then other things like reliability and professional software support should matter as well.</li>
<li>Do these other algorithms scale with the hardware you throw at them? How easy is it to run a read mapping on 64 compute nodes, each with 2 CPUs and 8 cores per CPU? What if you double the amount of hardware? Will you go twice as fast? With the GQ-Engine you will. Want to run on 1024 nodes? That&#8217;s possible.</li>
<li>Are you asking about speed for a single run, or the throughput for a bunch of runs? Last weekend I ran 2000 NGS read databases through our read mapping workflow (low coverage genome sequencing, about 80M reads per database). I started them on Friday afternoon, went for drinks with my friends that evening, had a nice family dinner on Saturday afternoon, and watched a movie with my kid afterwards. The runs were finished before I woke up on Sunday. No hiccups, no failed runs, no logs to monitor, and &#8211; best of all &#8211; no &#8220;one million-file&#8221; directories to organize. There were a lot of other customers on the system that weekend, doing their NGS analysis as well.</li>
</ul>
<p>If we ever meet for a demo, please ask me this question. I love to talk about it.</p>
<p>Henk Heus, Ph.D.<br />
VP Product Management &amp; Services<br />
GenomeQuest Inc.</p>
<p>At GenomeQuest we use an extended version of the GASSST read mapping algorithm (among others). Read about it in Bioinformatics here: <a title="http://bioinformatics.oxfordjournals.org/content/26/20/2534.abstract" href="http://bioinformatics.oxfordjournals.org/content/26/20/2534.abstract" target="_self">http://bioinformatics.oxfordjournals.org/content/26/20/2534.abstract</a></p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://blog.genomequest.com/2011/06/how-fast-is-your-read-mapping-algorithm/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Upcoming Improvements to the GenomeQuest Engine</title>
		<link>http://blog.genomequest.com/2011/06/upcoming-improvements-to-the-genomequest-engine/</link>
		<comments>http://blog.genomequest.com/2011/06/upcoming-improvements-to-the-genomequest-engine/#comments</comments>
		<pubDate>Mon, 13 Jun 2011 21:27:51 +0000</pubDate>
		<dc:creator>Henk Heus</dc:creator>
				<category><![CDATA[Message from Technology Team]]></category>
		<category><![CDATA[GenomeQuest]]></category>
		<category><![CDATA[GenomeQuest 7.1]]></category>
		<category><![CDATA[GenomeQuest Engine]]></category>
		<category><![CDATA[global alignment]]></category>
		<category><![CDATA[GQ-Engine]]></category>
		<category><![CDATA[interval indexing]]></category>
		<category><![CDATA[local alignment]]></category>
		<category><![CDATA[paired end reads]]></category>
		<category><![CDATA[Product update]]></category>

		<guid isPermaLink="false">http://blog.genomequest.com/?p=365</guid>
		<description><![CDATA[As the product manager at GenomeQuest, I&#8217;m very excited to tell you about a couple of really great new features in the GQ-Engine. Features that add to the growing library of high quality NGS components available to GQ platform developers and end users.
Fast local alignments of NGS reads
NGS read mappers typically align reads by trying to [...]]]></description>
			<content:encoded><![CDATA[<p>As the product manager at GenomeQuest, I&#8217;m very excited to tell you about a couple of really great new features in the GQ-Engine. Features that add to the growing library of high quality NGS components available to GQ platform developers and end users.</p>
<p><strong>Fast local alignments of NGS reads</strong></p>
<p>NGS read mappers typically align reads by trying to fit the entire read into the reference sequence. This is referred to as a global alignment, or best fit, strategy. While this works great for short genomic reads, it is not always the best possible solution for longer reads. When a read gets longer the chance of it matching the reference sequence over its entire length decreases.</p>
<p>The shortcomings of global alignment algorithms become readily apparent in RNA-seq studies where a single read can span multiple exons. These exons can be right next to each other in the mRNA, but separated by megabases of intronic sequence on the genome. The only way to align such reads is to use a local alignment strategy that can map different parts of a read to different positions on the reference sequence.</p>
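<p>To make the difference with global alignment concrete, here is a minimal sketch of the textbook Smith-Waterman local alignment score. It is an illustration only and not how GASSST or the GQ-Engine works; the function name and the scoring values (match, mismatch, gap) are assumptions chosen for the example.</p>

```python
def smith_waterman(read, reference, match=2, mismatch=-1, gap=-2):
    """Return (best_score, end_in_read, end_in_reference) for the best local alignment.

    End positions are 1-based. Scoring values are illustrative assumptions.
    """
    cols = len(reference) + 1
    previous = [0] * cols          # scores for the previous read position
    best = (0, 0, 0)
    for i in range(1, len(read) + 1):
        current = [0] * cols
        for j in range(1, cols):
            step = match if read[i - 1] == reference[j - 1] else mismatch
            score = max(
                0,                      # start a fresh local alignment here
                previous[j - 1] + step, # extend with a match or mismatch
                previous[j] + gap,      # gap in the reference
                current[j - 1] + gap,   # gap in the read
            )
            current[j] = score
            if score > best[0]:
                best = (score, i, j)
        previous = current
    return best


print(smith_waterman("ACGT", "TTACGTTT"))
```

<p>Because a cell can always restart at zero, only the best-matching stretch of the read contributes to the score. That property is what lets one part of a read align in one place without being penalized for the rest of the read belonging somewhere else.</p>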
<p>We have added local alignment capabilities to our GASSST read mapper while keeping the existing speed and scaling. This allows us to analyze NGS-sized data sets regardless of the read length or sample source and gets us ready for PacBio and Ion Torrent. It also supports our new RNA-seq workflow that maps the transcriptome directly to the genomic reference sequence.</p>
<p><strong>Improved support for Paired End (PE) read handling</strong></p>
<p>We have added exciting new possibilities to work with PE reads to the GQ-Engine. By examining all possible alignment combinations for a read pair we can keep the most likely alignment pairs for further analysis. This strategy to find &#8220;happy pairs&#8221; can be parameterized on the command line and takes into account the expected distance between the reads, the orientation of the alignment (fwd/rev strands), and the number of mismatches and indels that are needed to align the reads at those positions. All of this happens in memory while computing the alignments, and is much more efficient and exhaustive than the post-alignment processing strategies typically implemented by other read mappers.</p>
<p>Because the PE mapping strategy is fully integrated into the GQ-Engine, we have complete flexibility working with the results. We can, for example, decide to also keep the single end reads that are mapped with high confidence (we will). We can also dump all non-happy pairs into a separate alignment database to look for interesting things like copy number variation or structural variation.</p>
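<p>The happy-pair idea can be sketched in a few lines. This is a hedged illustration and not the GQ-Engine implementation: the <code>Alignment</code> class, the <code>happy_pairs</code> function, its parameters, and the opposite-strand rule are all assumptions made up for the example.</p>

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Alignment:
    position: int   # leftmost reference coordinate of the alignment
    strand: str     # "+" or "-"
    edits: int      # mismatches plus indels needed at this position


def happy_pairs(mate1_hits, mate2_hits, expected_distance, tolerance, max_edits):
    """Return candidate (mate1, mate2) pairs, most plausible first."""
    scored = []
    # Examine every combination of candidate alignments for the two mates.
    for a, b in product(mate1_hits, mate2_hits):
        if a.strand == b.strand:
            continue  # assumed rule: mates must map to opposite strands
        forward, reverse = (a, b) if a.strand == "+" else (b, a)
        deviation = abs((reverse.position - forward.position) - expected_distance)
        total_edits = a.edits + b.edits
        if deviation > tolerance or total_edits > max_edits:
            continue
        scored.append((total_edits, deviation, a, b))
    # Fewest edits first, then closest to the expected distance.
    scored.sort(key=lambda item: (item[0], item[1]))
    return [(a, b) for _, _, a, b in scored]


mate1 = [Alignment(1000, "+", 0), Alignment(50000, "+", 2)]
mate2 = [Alignment(1300, "-", 1), Alignment(90000, "-", 0)]
print(happy_pairs(mate1, mate2, expected_distance=300, tolerance=50, max_edits=3))
```

<p>In this toy input only the combination at positions 1000 and 1300 survives. Everything that is filtered out is what the paragraph above calls a non-happy pair.</p>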
<p>For our web interface users, using the PE read mapping strategy will be completely transparent. When you map a PE read database, we will ask you to confirm the expected insert size and read orientation. That&#8217;s all.</p>
<p><strong>Interval Indexing and Positional Based Annotation</strong></p>
<p>Interval indexing within the GQ-Engine adds the ability to very quickly find the overlap between different sets of intervals. Examples of use cases are: &#8220;find all alignments overlapping with exons of known genes&#8221;, or &#8220;find all SNPs in my data set that are already known in dbSNP&#8221;. This technology will support many use cases in the GenomeQuest platform. To start, it will speed up the existing variant annotation workflow and drive the RNA-seq workflow. More applications will follow soon.</p>
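<p>For readers who want a feel for what an overlap query involves, here is a minimal sketch of one common approach: a sorted array plus a running maximum of end positions. It is not the GQ-Engine data structure; the <code>IntervalIndex</code> class and its method names are assumptions for illustration, and it handles a single chromosome with closed intervals.</p>

```python
from bisect import bisect_right


class IntervalIndex:
    """Sorted-array index over closed intervals on one chromosome."""

    def __init__(self, intervals):
        # intervals: iterable of (start, end, label), start not greater than end
        self.items = sorted(intervals)
        self.starts = [item[0] for item in self.items]
        # running_max_end[i] is the largest end among items[0..i]
        self.running_max_end = []
        largest = float("-inf")
        for _, end, _ in self.items:
            largest = max(largest, end)
            self.running_max_end.append(largest)

    def overlapping(self, query_start, query_end):
        """Return labels of intervals sharing at least one position with the query."""
        hits = []
        # Only intervals that start at or before query_end can overlap.
        i = bisect_right(self.starts, query_end) - 1
        # Walk left; stop once nothing further left can reach query_start.
        while i >= 0 and self.running_max_end[i] >= query_start:
            _, end, label = self.items[i]
            if end >= query_start:
                hits.append(label)
            i -= 1
        hits.reverse()
        return hits


exons = IntervalIndex([(100, 200, "exon1"), (500, 650, "exon2"), (900, 1000, "exon3")])
print(exons.overlapping(180, 520))
```

<p>The same query shape covers both use cases mentioned above: alignments against exons, or called SNPs (intervals of length one) against dbSNP positions.</p>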
<p>The GenomeQuest 7.1 release is planned for Friday the 8th of July 2011. I hope to see you there.</p>
<p>Henk Heus, Ph.D.<br />
VP Product Management &amp; Services<br />
GenomeQuest Inc</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.genomequest.com/2011/06/upcoming-improvements-to-the-genomequest-engine/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Implications of exponential growth of global whole genome sequencing capacity</title>
		<link>http://blog.genomequest.com/2010/07/implications-of-exponential-growth-of-global-whole-genome-sequencing-capacity/</link>
		<comments>http://blog.genomequest.com/2010/07/implications-of-exponential-growth-of-global-whole-genome-sequencing-capacity/#comments</comments>
		<pubDate>Fri, 09 Jul 2010 13:00:09 +0000</pubDate>
		<dc:creator>Richard Resnick</dc:creator>
				<category><![CDATA[GenomeQuest]]></category>
		<category><![CDATA[Implications for Society]]></category>
		<category><![CDATA[Message from Technology Team]]></category>
		<category><![CDATA[Personalized Medicine]]></category>

		<guid isPermaLink="false">http://blog.genomequest.com/?p=254</guid>
		<description><![CDATA[Illumina&#8217;s HiSeq 2000 running at capacity can sequence two whole human genomes per week at 30x coverage &#8211; enough for a full-blown whole genome analysis. One instrument produces 104 human genomes per year.
Beijing Genomics Institute alone has purchased 128 of these instruments. The Broad has 51. And based on Illumina&#8217;s 2010 Q1 10-Q filing, they&#8217;ve [...]]]></description>
			<content:encoded><![CDATA[<p>Illumina&#8217;s HiSeq 2000 running at capacity can sequence two whole human genomes per week at 30x coverage &#8211; enough for a full-blown whole genome analysis. One instrument produces 104 human genomes per year.</p>
<p>Beijing Genomics Institute alone has purchased 128 of these instruments. The Broad has 51. And based on Illumina&#8217;s 2010 Q1 10-Q filing, they&#8217;ve got a backlog that represents maybe another 200 machines. So by 2011, there may be some 500 of these machines running. Not to mention the GA-IIs, the SOLiD machines, the 454 machines, Helicos, PacBio, Ion Torrent, Complete Genomics, and all of the next-next generation single-molecule sequencing companies making big promises.</p>
<p><strong>The Fact Is&#8230;</strong></p>
<p>&#8230;it&#8217;s easy to lose track of what this means. It&#8217;s easy to get stuck in today&#8217;s problems.</p>
<p>In 2010, we may have something like 1,000 publicly available human genomes at a wide variety of coverage. That&#8217;s giving us as a society the benefit of the doubt.</p>
<p>In 2011, the worldwide capacity for whole human genome sequencing will easily reach 50,000 &#8211; real data based on orders that have already been placed.</p>
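<p>The 50,000 figure follows from the numbers above. A quick back-of-the-envelope check, using the two genomes per week per instrument and the rough estimate of 500 running machines (the variable names are just for this example):</p>

```python
GENOMES_PER_INSTRUMENT_PER_WEEK = 2   # HiSeq 2000 at 30x coverage, as stated above
WEEKS_PER_YEAR = 52
INSTRUMENTS_RUNNING = 500             # rough 2011 estimate from the text, not a count

per_instrument = GENOMES_PER_INSTRUMENT_PER_WEEK * WEEKS_PER_YEAR
worldwide = per_instrument * INSTRUMENTS_RUNNING

print(per_instrument)  # 104
print(worldwide)       # 52000
```

<p>That counts only the HiSeq 2000 fleet and assumes every machine runs at capacity all year.</p>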
<p><em>Do we believe this is going to slow down? What incentives does the industry have to dial this down?</em> None that I can think of.</p>
<p>If it&#8217;s 50,000 genomes in 2011 (50x increase from 2010), it&#8217;s totally reasonable to believe that capacity will grow to 250,000 genomes by 2012 &#8211; that&#8217;s only a 5x increase from the previous year. Call 2013 a 4x increase over 2012 &#8211; that&#8217;s a capacity to sequence 1 million genomes, just three years from now.</p>
<p>The only thing in the way of this explosive growth is our ability to absorb the new capacity &#8211; and that gets directly to tools that can analyze the data. As the number of genomes increases exponentially, the types of questions we&#8217;ll ask of this data will change dramatically. We&#8217;re in the middle of an incredible revolution that will move more quickly than many of us appreciate. Let me propose one vision.</p>
<p><strong>2001-2009: A Human Genome</strong></p>
<p>The 10 or so years after the Human Genome Project, through say 2009, were characterized by large-scale research operations to understand the basic biology behind genomics. Gene and target discovery, pathway modeling, disease models, GWAS, expression analysis. Consumers of the Human Genome Project have been academic, pharmaceutical, and biotech researchers. The genome was sequenced, and sequencing was thought to be yesterday&#8217;s job.</p>
<p><strong>2010: 1,000 Genomes &#8211; Learning the Ropes</strong></p>
<p>In 2010, with the nascent adoption of NGS (if you think it&#8217;s widespread today, just wait), new applications have exploded onto the scene: larger-scale resequencing of exomes and whole genomes, RNA sequencing, ChIP-seq, metagenomic sequencing, and a renaissance in the agricultural sciences, which can finally run their own versions of the Human Genome Project. The consumers of this early-stage adoption of NGS remain the academic researchers, pharma and biotech researchers, and ag companies. We&#8217;re finding new variation across different ethnicities, identifying novel transcripts in previously well-understood genes, and developing exciting new insights in epigenetics. But it&#8217;s still basic research. And the bioinformatics community is still arguing about basic approaches to alignments, calling variants, and normalizing across experiments.</p>
<p><strong>2011: 50,000 Genomes &#8211; Clinical Flirtation</strong></p>
<p>How do things change when we have the capacity to sequence and analyze 50,000 genomes? Catalogues of human variation will become large-scale for the first time. We&#8217;ll build strong correlations between phenotype, genotype, and treatments. Early-stage sequence-based diagnostics will find their way into the leading-edge labs and hospitals. Pharma will take real steps towards the design and optimization of genotype-centric clinical trials. The FDA will provide better guidance on developing drugs and diagnostics that employ sequencing. We&#8217;ll start talking about &#8220;Genomicists&#8221; in the same way we currently describe Pathologists or Radiologists, although there will be very few of them. (Indeed, some Pathologists already believe that genomics will fall in their house.)</p>
<p><strong>2012: 250,000 Genomes &#8211; Clinical Early Adoption</strong></p>
<p>With 250,000 genomes, the clinical adoption of sequence data will begin in earnest. Genomics-based diagnostics will be a real business: comments from a recent J.P. Morgan report indicate that lab managers believe that this switch will occur in the next 5 years, particularly in cancer detection and classification. The FDA will support pharmacogenomics-based clinical trials at large. Population studies will continue to drive massive insights into human variation. Leading-edge hospitals will store whole genome data for patients as a part of their medical records. The consumers of NGS are changing from academic and commercial researchers to Pathologists, Genomicists, VPs of Clinical Development in pharma, and young doctors everywhere.</p>
<p><strong>2013: 1 Million Genomes &#8211; Consumer Awareness</strong></p>
<p>When the planet has the capacity to sequence 1 million genomes per year, many 1st-world health-care consumers will have enough knowledge to seek out health-care providers who provide these services. Savvy patients, already practiced in researching their own conditions on the Internet prior to a doctor&#8217;s visit, will begin to push back on doctors&#8217; recommendations, saying, &#8220;before we make a decision on that cancer treatment, I want my genome sequenced to see whether it&#8217;ll be effective.&#8221; Health and life insurance companies will get into the game, and barring significant ethical battles, will use genomic information to guide treatments, suggest specialists, and even set prices for premiums. Diagnostics for personalized care will double from the previous year. The personal genome will be within reach to many individuals, and the FDA will struggle to keep up with regulation to restrict the use of personal genomes from unapproved diagnostics. It is not at all clear to this author whether the FDA is sufficiently staffed to keep pace with the innovation that will explode from this level of availability of sequencing capacity.</p>
<p><strong>2014: 5 Million Genomes &#8211; Consumer Reality</strong></p>
<p>Many cancers in the 1st-world will be sequenced as a regular component of a biopsy. Patterns of drug efficacy will be published and made available against different genotypes. Oncologists will work with statisticians to develop treatment programs. Hospitals will offer whole genome sequencing services to newborns. Chronic pain will be managed on a genotype-by-genotype basis. Medical schools will redesign their curricula to produce physicians and researchers to lead medicine into the Genomics Age and to provide advanced training for the Genomicist specialty.</p>
<p><strong>2015-2020: 25 Million Genomes And Beyond &#8211; A Brave New World</strong></p>
<p>The ability to sequence 25 million genomes just five years from now seems well within the industry&#8217;s grasp, barring significant issues of uptake and absorption of the data. And applying just a doubling of capacity each year between 2015 and 2020, we would have the capacity to sequence just under 1 billion genomes a year by 2020. This will have drastic impacts on society.</p>
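<p>The whole projection is just compounding. Here is a small sketch that replays it year by year. The multipliers for 2011 through 2013 are stated above, those for 2014 and 2015 are implied by the section headings, and 2016 onward uses the doubling assumption. The 2010 baseline is the 1,000 publicly available genomes rather than a measured capacity.</p>

```python
# Year-over-year growth factors, stated or implied in the text above.
multipliers = {
    2011: 50, 2012: 5, 2013: 4, 2014: 5, 2015: 5,
    2016: 2, 2017: 2, 2018: 2, 2019: 2, 2020: 2,
}

capacity = {2010: 1_000}  # baseline: roughly 1,000 genomes in 2010
for year in sorted(multipliers):
    capacity[year] = capacity[year - 1] * multipliers[year]

for year, genomes in capacity.items():
    print(f"{year}: {genomes:,}")
```

<p>Five doublings from 25 million give 800 million genomes a year in 2020, which is the &#8220;just under 1 billion&#8221; above.</p>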
<p>While the health-care industry will continue to adopt sequencing for broader and broader applications, the insurers will do everything in their power to get access to this information both for the microeconomic management of individuals as well as for the macroeconomic indicators of ethnic and regional health that will surely increase their profit margins.</p>
<p>Consumer applications for genomics will flower: want to see whether you are genetically compatible with your new girlfriend? There&#8217;s an app for that. DNA sequencing on your iPhone? Believe it. Personalized genomic massage, anyone? This is already happening today &#8211; see labs testing for allele 334 of the AVPR1a gene to see whether your new mate has the &#8220;cheating gene.&#8221; Then imagine the market for consumer applications and gimmicks when your entire genome is already on a USB drive.</p>
<p>Genetic discrimination may need to be addressed in the highest regulatory bodies: do you really want to elect a President whose genome suggests cardiomyopathy? Think this won&#8217;t happen? Just imagine the first candidate to release his healthy genome just like his last two years of tax returns, challenging his opponents to do the same. What will the world&#8217;s reaction be?</p>
<p>Will LinkedIn and Facebook suggest people you may be related to? Sure, they&#8217;ll probably not have your genome, but your genome will be <em>somewhere</em> in a de-identified way, sitting right next to other de-identified genomes. It&#8217;s easy to envision software to mine this data that will find your relatives and common ancestors. It may start as a medical application but it won&#8217;t be able to stay that way. Just let that software platform tell you that they&#8217;ve found a genome of someone who looks like a third cousin and provide a way to reach out to them anonymously. Welcome to ChromosomallyLinkedIn.</p>
<p><strong>Back to Reality</strong></p>
<p>I&#8217;m no futurist &#8211; most weeks I can barely tell you what my schedule is the following week. So while it&#8217;s fun to dream up the next decade, there are too many variables to get it all right and this thought experiment may be off a few years in any direction. We&#8217;re squarely in 2010, the year of the 1,000 genomes. The deeper we allow ourselves to look into the future, the less clear it becomes.</p>
<p>But one thing is certain &#8211; <em>sequencing capacity world-wide will continue to grow exponentially for at least the next 10 years</em>. This is going to happen. That means sample preparation will get vastly easier, throughput will continue to increase at a dizzying rate, sequencing costs will plummet, and the applications of sequencing will become more mass-market.</p>
<p>And most of all, it means that the software that we use to analyze sequence will need to become a lot simpler to use, and more purpose-built for specific applications. General bioinformatics frameworks are dinosaurs awaiting the impact of the meteor. In the (near) future, no one will be arguing about gapped vs. ungapped alignments. No one will be talking about Phred-like quality scores. No one will be talking about reads, even &#8211; they&#8217;ll seem like antiquated tiny puzzle pieces from a past when sequencing technology was like a nuclear bomb rather than a precision scalpel.</p>
<p>As I look ahead to develop the long-term vision for the product roadmap for GenomeQuest it&#8217;s obvious to me that our immediate-term focus must be on <strong>simple, easy-to-use, whole and multi-genome analysis</strong>. With the coming of 50,000 genomes next year, our immediate problem is supporting the absorption of this new knowledge. That means continuing to enable the processing of data as quickly as it comes off the sequencers and presenting it to end users in a way they can understand, interact with, and discover. What are all of the proteins affected by this individual&#8217;s variants and what are the types of modifications we see? How does that impact disease pathways? How is this individual similar to others for whom we have treatment/outcome data?</p>
<p>Today&#8217;s consumer of genome sequencing is the researcher or clinician doing basic discovery with thousands or hundreds of thousands of genomes. But the longer-term audience is the clinic itself.</p>
<p>And I for one don&#8217;t think we have that long to wait.</p>
<p>Calling all clinicians.</p>
<p>Rants welcomed.<br />
-Richard</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.genomequest.com/2010/07/implications-of-exponential-growth-of-global-whole-genome-sequencing-capacity/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Announcing ChIP-Seq Support</title>
		<link>http://blog.genomequest.com/2009/11/announcing-chip-seq-support/</link>
		<comments>http://blog.genomequest.com/2009/11/announcing-chip-seq-support/#comments</comments>
		<pubDate>Mon, 09 Nov 2009 13:30:19 +0000</pubDate>
		<dc:creator>GenomeQuest</dc:creator>
				<category><![CDATA[Message from Technology Team]]></category>
		<category><![CDATA[ChIP-Seq]]></category>
		<category><![CDATA[GenomeQuest]]></category>
		<category><![CDATA[GenomeQuest 6.0Beta]]></category>
		<category><![CDATA[Next Generation Sequencing]]></category>
		<category><![CDATA[NGS]]></category>
		<category><![CDATA[RNA-Seq]]></category>

		<guid isPermaLink="false">http://blog.genomequest.com/?p=75</guid>
		<description><![CDATA[GenomeQuest released its ChIP-Seq workflow this week, available to anyone with a Free Basic Account inside of GenomeQuest. ]]></description>
			<content:encoded><![CDATA[<p>We&#8217;ve released our ChIP-Seq workflow this week, available to anyone with a Free Basic Account inside of GenomeQuest. Like all of our NGS workflows, it runs in two basic steps: a mapping step and a downstream analysis step. In this case, of course, the downstream analysis is a peak-finding algorithm. We chose the MACS modeling software for peak modeling. (You can see the entire workflow&#8217;s documentation <a href="http://wiki.genomequest.com/index.php/ChipSeq_Workflow">here</a>.) Integrated into the GenomeQuest Sequence Data Management platform, it outputs a heavily annotated <strong>sequence database</strong>, which can then be interactively filtered, grouped, sorted, and mined for peaks of interest. And this can all be connected to your RNA-Seq and resequencing data to get the global picture.</p>
<p>So now researchers can go from their ChIP-Seq NGS runs directly to gene-based annotation of the peaks found by their biology. Select regions of interest, or genes of interest, or peaks of a certain class, and drill down to see the actual evidence that backs up the call.</p>
<p>We&#8217;re giving away free ChIP-Seq runs to the first 100 people to <a href="http://www.genomequest.com/basic-registration/">sign up</a>.</p>
<p>As always, feel free to leave a comment &#8211; we read every one.</p>
<p>Richard J. Resnick<br />
VP Software and Services</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.genomequest.com/2009/11/announcing-chip-seq-support/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>APIs + Sequence Data Management = Haplotype Tables</title>
		<link>http://blog.genomequest.com/2009/09/apis-sequence-data-management-haplotype-tables/</link>
		<comments>http://blog.genomequest.com/2009/09/apis-sequence-data-management-haplotype-tables/#comments</comments>
		<pubDate>Wed, 30 Sep 2009 13:30:52 +0000</pubDate>
		<dc:creator>GenomeQuest</dc:creator>
				<category><![CDATA[Message from Technology Team]]></category>
		<category><![CDATA[API]]></category>
		<category><![CDATA[detection variants]]></category>
		<category><![CDATA[GenomeQuest]]></category>
		<category><![CDATA[haplotype tables]]></category>
		<category><![CDATA[Sequence Data Management]]></category>
		<category><![CDATA[SNP]]></category>
		<category><![CDATA[variant calling]]></category>

		<guid isPermaLink="false">http://blog.genomequest.com/?p=68</guid>
		<description><![CDATA[We&#8217;ve heard lots of requests from customers not only to provide them with powerful methods for detecting variants across multiple experiments (or phenotypes, or organisms, or lines), but also to unify all of this data to find knowledge that spans these experiments.
Of course we have our variant calling workflow, just as we integrate with other variant [...]]]></description>
			<content:encoded><![CDATA[<p>We&#8217;ve heard lots of requests from customers not only to provide them with powerful methods for detecting variants across multiple experiments (or phenotypes, or organisms, or lines), but also to unify all of this data to find knowledge that spans these experiments.</p>
<p>Of course we have our variant calling workflow, just as we integrate with other variant calling workflows. All of these produce GenomeQuest-native browsable, mineable, and queryable databases. And because of the GQ Engine, we can easily combine sets of 10s or 100s or even 1,000s of these variant databases into a single queryable entity with &#8220;web-speed query performance.&#8221;</p>
<p>Nevertheless, while our customers get the benefit of the combined data, they often ask for more. So today I jumped into the APIs of GenomeQuest and tried to address the simple problem of building a table of SNPs that span a series of experiments. Each SNP should have the specific allele called for each experiment in which it was found. A simple little table designed to be the input into any of a number of linkage disequilibrium mapping packages. I made a GQ Plug-in: 5 lines of code to make it accessible in the user interface, and another 100 lines of code (I&#8217;m wordy) on the back-end to build the table and present it. And so, the multi-experiment haplotype table is born. I might even convince the development team to include it in our next live push.</p>
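<p>To show the shape of such a table, here is a generic sketch in plain Python. It is not the plug-in and does not use the GenomeQuest APIs: the <code>build_haplotype_table</code> function, the input tuple layout, and the &#8220;.&#8221; placeholder for a missing call are all assumptions for illustration.</p>

```python
from collections import defaultdict


def build_haplotype_table(calls, missing="."):
    """Pivot SNP calls into one row per SNP and one column per experiment.

    calls: iterable of (experiment, chromosome, position, allele) tuples.
    """
    by_snp = defaultdict(dict)
    experiments = set()
    for experiment, chromosome, position, allele in calls:
        by_snp[(chromosome, position)][experiment] = allele
        experiments.add(experiment)

    columns = sorted(experiments)
    header = ["chromosome", "position"] + columns
    rows = []
    for chromosome, position in sorted(by_snp):
        alleles = by_snp[(chromosome, position)]
        # Fill in the placeholder where an experiment has no call for this SNP.
        rows.append([chromosome, position] + [alleles.get(e, missing) for e in columns])
    return header, rows


calls = [
    ("line_A", "chr1", 1042, "A"),
    ("line_B", "chr1", 1042, "G"),
    ("line_A", "chr2", 77, "T"),
]
header, rows = build_haplotype_table(calls)
print(header)
for row in rows:
    print(row)
```

<p>Each row carries the allele called in every experiment where the SNP was found, which is the input shape many linkage disequilibrium tools expect once it is written out as a delimited file.</p>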
<p>If you want to hear more or check out the code, drop me a line.</p>
<p>Richard J. Resnick<br />
VP Software and Services</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.genomequest.com/2009/09/apis-sequence-data-management-haplotype-tables/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Guiding Principle of GenomeQuest 6.0Beta Platform</title>
		<link>http://blog.genomequest.com/2009/09/guiding-principle-of-genomequest-6-0beta-platform/</link>
		<comments>http://blog.genomequest.com/2009/09/guiding-principle-of-genomequest-6-0beta-platform/#comments</comments>
		<pubDate>Fri, 18 Sep 2009 12:30:24 +0000</pubDate>
		<dc:creator>GenomeQuest</dc:creator>
				<category><![CDATA[Message from Technology Team]]></category>
		<category><![CDATA[bioinformaticists]]></category>
		<category><![CDATA[GenomeQuest 6.0Beta]]></category>
		<category><![CDATA[sequence cycle]]></category>
		<category><![CDATA[Velvet assembly]]></category>
		<category><![CDATA[workflows]]></category>

		<guid isPermaLink="false">http://blog.genomequest.com/?p=66</guid>
		<description><![CDATA[The guiding principle of the development of the GenomeQuest 6.0Beta platform is to support the complete "sequence cycle" – from uploading raw reads, to mapping them to an arbitrary reference, to generating knowledge through a variety of workflows, and then ultimately through to the assembly of those reads for use as the reference for tomorrow's experiment. ]]></description>
			<content:encoded><![CDATA[<p>The guiding principle of the development of the GenomeQuest 6.0Beta platform is to support the complete &#8220;sequence cycle&#8221; – from uploading raw reads, to mapping them to an arbitrary reference, to generating knowledge through a variety of workflows, and then ultimately through to the assembly of those reads for use as the reference for tomorrow&#8217;s experiment. In this way, investments in sequencing are not one-off, but rather continually augment each other over long periods of research.</p>
<p>We&#8217;re getting very close to releasing a new point version of GenomeQuest 6.0Beta that includes, among other things, support for the Velvet assembly tool. This is one more step in supporting the sequence cycle: a powerful, easy-to-use, widely adopted assembly package. In addition to releasing the Velvet tool inside GenomeQuest 6.0Beta, we&#8217;re going to release the full &#8220;how-to&#8221; of its implementation for those bioinformaticists and developers who want a close-up look at how anyone can integrate virtually any tool into the GenomeQuest 6.0Beta platform.</p>
<p>Richard J. Resnick<br />
VP Software and Services</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.genomequest.com/2009/09/guiding-principle-of-genomequest-6-0beta-platform/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CHI Conference</title>
		<link>http://blog.genomequest.com/2009/08/chi-conference/</link>
		<comments>http://blog.genomequest.com/2009/08/chi-conference/#comments</comments>
		<pubDate>Wed, 26 Aug 2009 15:50:56 +0000</pubDate>
		<dc:creator>GenomeQuest</dc:creator>
				<category><![CDATA[Message from Technology Team]]></category>
		<category><![CDATA[NGS]]></category>

		<guid isPermaLink="false">http://blog.genomequest.com/?p=27</guid>
		<description><![CDATA[The Next Generation Sequencing Data Analysis conference in Providence in September looks like it's shaping up to be a good one.]]></description>
			<content:encoded><![CDATA[<p>The Next Generation Sequencing Data Analysis conference in Providence in September looks like it&#8217;s shaping up to be a good one. Plenty of GenomeQuesters will be there, as will I. We&#8217;re sponsoring a roundtable &#8211; <em>NGS Five Years Down The Road</em>. In five years&#8217; time, NGS will have gone from &#8220;sexy new technology&#8221; to &#8220;established and mainstream.&#8221; The price of sequencing and analysis will come down, and the complexity and volume of experiments will go up. What will remain from our efforts today, and what will NGS analysis look like?</p>
<p>Dr. Henk Heus from GenomeQuest will represent us, and I hope to be there personally. Stop by and participate, and introduce yourself!</p>
<p>Richard J. Resnick<br />
VP Software and Services</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.genomequest.com/2009/08/chi-conference/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>GenomeQuest 6.0Beta Launch Update</title>
		<link>http://blog.genomequest.com/2009/08/genomequest-6-0beta-launch-update/</link>
		<comments>http://blog.genomequest.com/2009/08/genomequest-6-0beta-launch-update/#comments</comments>
		<pubDate>Wed, 19 Aug 2009 19:07:14 +0000</pubDate>
		<dc:creator>GenomeQuest</dc:creator>
				<category><![CDATA[Message from Technology Team]]></category>
		<category><![CDATA[GenomeQuest 6.0Beta]]></category>
		<category><![CDATA[NGS]]></category>
		<category><![CDATA[RNA-Seq]]></category>

		<guid isPermaLink="false">http://blog.genomequest.com/?p=24</guid>
		<description><![CDATA[The past few weeks have been exciting! The rush of users to the 6.0Beta product has kept us on our toes trying to ensure that everyone can find their way around. Our users&#8217; ingenuity is thrilling to watch. The heart of the company beats in rhythm with users who take a few hundred million reads in [...]]]></description>
			<content:encoded><![CDATA[<p>The past few weeks have been exciting! The rush of users to the 6.0Beta product has kept us on our toes trying to ensure that everyone can find their way around. Our users&#8217; ingenuity is thrilling to watch. The heart of the company beats in rhythm with users who take a few hundred million reads in color space and do RNA-Seq over a time series. It&#8217;s a really fun time to be at GenomeQuest &#8211; there are lots of discoveries waiting to be made in the pent-up, unanalyzed NGS data that has been pouring in.</p>
<p>Coming soon&#8230; ChIP-Seq, Velvet, much more&#8230;</p>
<p>Richard J. Resnick<br />
VP Software and Services</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.genomequest.com/2009/08/genomequest-6-0beta-launch-update/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thrilled About GenomeQuest 6.0Beta</title>
		<link>http://blog.genomequest.com/2009/07/thrilled-about-genomequest-6-0/</link>
		<comments>http://blog.genomequest.com/2009/07/thrilled-about-genomequest-6-0/#comments</comments>
		<pubDate>Wed, 22 Jul 2009 12:21:47 +0000</pubDate>
		<dc:creator>GenomeQuest</dc:creator>
				<category><![CDATA[Message from Technology Team]]></category>
		<category><![CDATA[GenomeQuest 6.0Beta]]></category>

		<guid isPermaLink="false">http://blog.genomequest.com/?p=10</guid>
		<description><![CDATA[I'm thrilled about our launch of GenomeQuest 6.0Beta because I believe it'll really have an impact on researchers' abilities to make sense of their NGS data. ]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m thrilled about our launch of GenomeQuest 6.0Beta because I believe it&#8217;ll really have an impact on researchers&#8217; abilities to make sense of their NGS data. We&#8217;ve overlaid a web-based toolkit of NGS analyses on top of our world-class NGS-scale sequence data management engine, and have included all of the world&#8217;s reference data. In just a few clicks, your reads are mapped, classified, and summarized against whatever genome you&#8217;re researching. Our variant calling capability allows for such powerful global mining that you can ask questions like, &#8220;show me all of the SNPs and indels that truncate a protein and are previously undiscovered.&#8221; To be able to do that in a few clicks in your web browser is pretty cool. And while our RNA-Seq workflow of course computes RPKMs, it also gets you into discovery by letting you examine splice variants all the way down to the level of individual reads. The ability to change the scale up and down so drastically gives researchers huge insights.</p>
<p>My favorite aspect of GenomeQuest 6.0Beta is its openness. Bioinformaticians can plug in their own algorithms and make their own workflows, while at the same time leveraging all of the power of the centrally managed reference data. Researchers can share their results at the click of a button. And the whole thing is available and open to try by anyone who wants to use it.</p>
<p>We&#8217;ll keep making GenomeQuest better and better with your help, and I look forward to hearing your feedback on the product.</p>
<p>Richard Resnick<br />
VP Software and Professional Services</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.genomequest.com/2009/07/thrilled-about-genomequest-6-0/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

