EMBL-Women-2019

Maria-Theresa, an EMBL Teen, was invited to join Malvika Sharan, a computational biologist at EMBL Heidelberg, to learn about bio-computational research and possible career paths.

View the Project on GitHub malvikasharan/EMBL-Women-2019

Open Science & Unix

Post by Malvika Sharan, 2019-08-02


Maria-Theresa arrived at my office this morning (30 July 2019) and hesitantly asked if we could do some coding today. I had been tentatively planning to show her how to use Git from the command line, but her request was a good opportunity to first introduce her to command-line computing, which will also form a basis for learning Git in the coming days.

We started by downloading Git Bash (Git for Windows) on her computer. Since we were going to use open-source products such as Git and Git Bash for the rest of the day, I couldn’t resist talking about Open Science.

Git is a free and open-source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

Git Bash is a program for Windows that provides a Unix-style command-line environment (a Bash emulation) from which Git and other Unix commands can be run.

Open Reproducible Research, EU FOSTER

I also shared Bio-IT’s Open Access Unix material and briefly told her how this material has been developed openly with the help of several community members. As a result, not only has it been improved over the last 7 years, but it has also become part of a sustainable resource that is maintained and reused by others.

We had a session based on The Carpentries teaching style: Show, Don’t Tell. We went through some important commands, starting with date and whoami, moving on to cd, pwd and ls, then directories and paths, commands with arguments, and finally the dangerous rm with its careful -i option. In the process of learning Unix, we organized her pre-existing Python scripts in a folder and downloaded the famous Iris dataset to manipulate and explore with other commands such as cut, sort, uniq, and grep, combined through piping. Maria-Theresa could review these lessons on her own whenever I had to go to meetings or deal with other tasks.
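The commands from this session can be sketched roughly as follows. This is a minimal illustration rather than a record of what we typed: the directory name unix_practice and the five-row file iris_sample.csv are made up so that the example runs without downloading anything, and the column layout (four measurements followed by the species name, no header) is assumed to match the commonly distributed Iris data file.

```shell
# Who am I, what time is it, and where am I?
date
whoami
pwd

# Work in a throwaway directory so nothing important is touched
workdir=$(mktemp -d)
cd "$workdir"
mkdir unix_practice          # hypothetical folder name
cd unix_practice
pwd
ls -l

# A tiny stand-in for the Iris data (made-up subset, assumed column layout)
cat > iris_sample.csv <<'EOF'
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
EOF

# cut picks column 5 (species), sort groups identical lines, uniq -c counts them
cut -d, -f5 iris_sample.csv | sort | uniq -c

# grep keeps the matching lines; the pipe hands them to sort (numeric, by column 1)
grep "virginica" iris_sample.csv | sort -t, -k1,1n

# rm -i asks before deleting; here the answer "y" is supplied through a pipe
cp iris_sample.csv copy_to_delete.csv
echo y | rm -i copy_to_delete.csv
```

In an interactive terminal you would type y or n yourself when rm -i asks, which is exactly the safety net that makes the option worth the habit.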

Our group at work has a ritual of getting a coffee together after lunch and chatting about other things that we are currently doing or find interesting. Today, we were talking about publications, specifically how Open Access plays a role in bridging the gaps between academia and the public, between developed and developing countries, and between wealthy institutes and institutes that have limited funding to pay for expensive journals, and why it is unfair to have an uneven distribution of knowledge based on economic power. My colleagues kindly included Maria-Theresa in the group, making sure to simplify some of the concepts for her before diving into the conversation.

A random encounter in the cafeteria with Emilia Esposito, a postdoc at EMBL (who explained how wet-lab experiments can be reproduced and how gel electrophoresis works), and the EMBL archivist Anne-Flore Laloë (standing at the back).

Throughout the day I also encountered several colleagues and friends at EMBL, and I would stop to chat with them about something we are working on together or simply to exchange a few friendly words. Since I didn’t want Maria-Theresa to think that I spend most of my day talking to people (which is not completely incorrect!), I explained why we need to connect with people at work or in a community. I plugged my recent talk on ‘Inclusiveness in Open Science’, where I had mentioned a misconception in academia: that we don’t need to talk about aspects of our personal lives such as identity, orientation, or mental health because they don’t directly affect science. I not only disagree with this mentality but actively try to create space for the members of my community where they are encouraged to share these aspects and feel completely included. If we don’t allow our colleagues to be open, they can never fully feel valued and accepted for who they are, which in turn negatively affects their mental health as well as their scientific performance.

In the final hour, she learned about for loops in Unix, which can automate repetitive tasks and generate reproducible results. I introduced her to Software Carpentry’s Open Access learning materials on Unix and Python to refer to in the future. During the day, we had also talked about misinformation and fraud in science (which my supervisor Toby Gibson loves to expose; watch him talk about it). In that context, we carried on with our discussion on the importance of reproducibility in science, which motivates us to design our projects so that we can work openly. This approach can address the issue of the reliability of results in scientific publications. If we publish our data, raw files, and code with our research paper, we allow others to verify our work, reuse our resources in their projects, and advance science in the right direction.
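To make the point about for loops and reproducibility concrete, here is a small sketch of the idea. The file names (sample_1.txt and so on) and their contents are invented for illustration; the loop itself only uses standard shell features. It applies the same step to every matching file and writes the results to one table, so rerunning it on the same inputs regenerates the same output.

```shell
# Throwaway directory with three made-up input files (hypothetical names and contents)
workdir=$(mktemp -d)
cd "$workdir"
printf 'a\nb\nc\n' > sample_1.txt
printf 'a\nb\n' > sample_2.txt
printf 'a\n' > sample_3.txt

# Start from an empty results file so that reruns do not pile up old lines
: > line_counts.tsv

# Apply the same step to every file that matches the pattern
for file in sample_*.txt
do
    # Count the lines; tr strips the padding some versions of wc add
    lines=$(wc -l < "$file" | tr -d ' ')
    printf '%s\t%s\n' "$file" "$lines" >> line_counts.tsv
done

cat line_counts.tsv
```

Keeping a loop like this in a script next to the data is one simple way to let someone else repeat the analysis step by step.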

Our day ended with us deciding to visit the EMBL archive next week (an outcome of a random chat I had with Emilia and Anne-Flore) and to continue learning Unix and Git in the light of openness, reproducibility, inclusiveness, and traceability of scientific claims.
