
Getting Started With Bioinformatics


When I first heard about the field of Bioinformatics, I thought about how cool it would be to work in it. The idea of analyzing sequencing data using computers fascinated me. I was not good at running microbiology experiments anyway, and the high variability that comes with performing such experiments bugged me to no end. This led me to seriously consider Bioinformatics as a career path, since you have complete control over the code you write. Plus, you can practice Bioinformatics wherever you go! All you need is a good laptop.

To get a feel for what Bioinformatics was like, I attended a four-day workshop on next-generation sequencing (NGS) data analysis organized by EMBL. I felt clueless and lost during the first few days of the workshop. My head would spin looking at the terminal window on a Linux machine, and I would drown in the fear of never being good at Bioinformatics. But I really wanted to prove to myself that I could do it, and I kept faking it till I started making it.

Bioinformatics may seem like a daunting field to enter if you are from a wet-lab biology background, and the fear of programming is the main thing that stops people from pursuing it. But it does not have to be this way. This fear arises because biologists are usually not given enough training in programming languages early on. With everything easily accessible on the internet nowadays, it is not too difficult to teach yourself Bioinformatics, although it does take a considerable amount of time and effort to become proficient at it.

After the workshop, I slowly started connecting on LinkedIn with people already in this field. Most advised me to pick up a programming language such as Python, Perl or R. I did the online Python course on Codecademy. I also taught myself R programming using the website http://www.r-tutor.com/r-introduction. Both these resources were helpful before I started taking graduate-level courses on Bioinformatics at my university. The link http://www.ee.surrey.ac.uk/Teaching/Unix/ is a good resource for learning the basics of Unix, which is essential if you want to analyze NGS data.

The most important thing about learning Bioinformatics skills is to be consistent with learning. Keep doing the above courses without getting distracted by anything else. Do not give up too early either. Keep at it and see it through.

Once you become familiar with these resources, it is time to get yourself involved in real Bioinformatics projects. I gained experience by working at a Bioinformatics lab at my grad school, but this is not required. You may also follow online tutorials for analyzing RNA-seq data, for instance https://github.com/griffithlab/rnaseq_tutorial. You may then apply the code you learnt from such tutorials to analyze a publicly available dataset on GEO (Gene Expression Omnibus). Such a project will help you apply the R and Unix skills you learnt from the tutorials.

To apply your Python skills, I would suggest writing a simple text parser function. The field of Bioinformatics is filled with a myriad of file formats such as SAM, FASTQ, GTF, BED, MAF and so on. A good project would be to write a function that reads in a BED file and prints out the coordinates belonging to chromosome 1. Or try parsing a FASTQ file and creating a dictionary with the sequence ID as the key and the sequence as the value. If you want to advance your programming skills in Python, https://rosalind.info/problems/locations/ and https://leetcode.com/ are links worth checking out.
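
To give an idea of what these two practice projects could look like, here is a minimal sketch in Python. It is only one possible approach, not a definitive implementation. The function names (bed_records_for_chrom, fastq_to_dict) are made up for illustration, and the code assumes that chromosome 1 is named "chr1" in your BED file (some files use just "1") and that every FASTQ record takes up exactly four lines, which is the common layout but not guaranteed for every file.

```python
def bed_records_for_chrom(lines, chrom="chr1"):
    """Yield (chrom, start, end) for each BED record on the given chromosome.

    `lines` can be an open file handle or any iterable of strings.
    Assumption: the chromosome naming style ("chr1" vs "1") matches the file.
    """
    for line in lines:
        line = line.strip()
        # Skip blank lines and the optional header lines a BED file may carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # not a valid BED record; needs chrom, start and end
        if fields[0] == chrom:
            yield fields[0], int(fields[1]), int(fields[2])


def fastq_to_dict(lines):
    """Return a dictionary mapping each sequence ID to its sequence.

    Assumption: every record is exactly four lines
    (header, sequence, '+' separator, quality string).
    """
    sequences = {}
    handle = iter(lines)
    for header in handle:
        header = header.strip()
        if not header:
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"Expected a FASTQ header line, got: {header!r}")
        sequence = next(handle, "").strip()
        next(handle, "")  # the '+' separator line, not needed here
        next(handle, "")  # the quality line, not needed here
        # The ID is the text after '@' up to the first whitespace.
        seq_id = header[1:].split()[0]
        sequences[seq_id] = sequence
    return sequences


if __name__ == "__main__":
    # Small made-up examples so the sketch can be run without any input files.
    demo_bed = ["chr1\t100\t200", "chr2\t5\t50", "chr1\t300\t400\tfeature"]
    for record in bed_records_for_chrom(demo_bed):
        print(*record, sep="\t")

    demo_fastq = ["@read1 demo", "ACGT", "+", "IIII"]
    print(fastq_to_dict(demo_fastq))
```

To run it on real data, you would pass an open file instead of a list, for example `with open("my_regions.bed") as handle:` followed by a loop over `bed_records_for_chrom(handle)`; the file name here is only a placeholder. For serious work, established libraries such as Biopython already handle the edge cases of these formats, but writing the parser yourself is a good way to practice.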

If you feel like your momentum is decreasing at times, it helps to follow bloggers who write about their coding experience, or to follow programmers on Twitter. It also helps to reach out to them with questions if you get stuck on an issue.

So do not lose hope! Definitely pursue Bioinformatics if you feel excited and passionate about it, and I promise you will not regret it. You will feel proud of what you are accomplishing. My journey from wet-lab biologist to Bioinformatician was not an easy one, but it was totally worth it. The view is great once you surmount the steep learning curve.