What I have learned from building a pharmacogenomics platform using data analytics tools
As a student in the UC Davis MSBA program, I got the unique opportunity to work with industry partners on real business problems for our practicum project. It has been a great journey working with scientists from CPMC-RI on their Cancer avatar project, which aims to identify the best treatment tailored to each patient’s genes and type of cancer. Our team has worked hard knowing that this is more than just a part-time project: it is something that can help people who struggle with cancer and create a social impact. We are almost midway through our last quarter and close to the end of this project as well. In this post, I want to talk about what I have learned from this project that has helped me be better prepared to work as a real data analyst.
We used many tools familiar to data analysts, such as SQL, Python, and Tableau, to create a data pipeline that automatically cleans and uploads data from scattered Excel files, consolidates them into a single database, calculates related metrics, and finally brings everything together in an easy-to-use dashboard that helps scientists and physicians decide on the best treatment for a given patient. The platform combines drug response data with genomic profiles to understand how tumor samples react to drugs based on their genes. Over this 9-month project, we were able to apply concepts we learned in class. We also faced some challenges along the way, which made the project good practice and a real learning opportunity.
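To make the clean, consolidate, and calculate steps more concrete, here is a minimal sketch of the idea. It is not our actual project code. It assumes the spreadsheet rows have already been read into Python dictionaries (for example with `pandas.read_excel`), and the table name `drug_response`, the column names `sample_id`, `drug`, and `viability`, and the mean-viability metric are all hypothetical placeholders chosen for illustration.

```python
import sqlite3


def clean_row(raw):
    """Normalize one raw spreadsheet row; return None if it is unusable."""
    sample = (raw.get("sample_id") or "").strip().upper()
    drug = (raw.get("drug") or "").strip().lower()
    try:
        viability = float(raw.get("viability"))
    except (TypeError, ValueError):
        # Missing or non-numeric measurement: skip the row.
        return None
    if not sample or not drug:
        return None
    return (sample, drug, viability)


def load_rows(conn, raw_rows):
    """Clean rows from any source file and append them to one shared table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS drug_response "
        "(sample_id TEXT, drug TEXT, viability REAL)"
    )
    cleaned = [row for row in map(clean_row, raw_rows) if row is not None]
    conn.executemany("INSERT INTO drug_response VALUES (?, ?, ?)", cleaned)
    conn.commit()
    return len(cleaned)


def mean_viability_by_drug(conn):
    """Compute a simple per-drug metric that a dashboard could display."""
    rows = conn.execute(
        "SELECT drug, AVG(viability) FROM drug_response "
        "GROUP BY drug ORDER BY drug"
    ).fetchall()
    return dict(rows)
```

In a sketch like this, every source file goes through the same `load_rows` call, so inconsistent spacing or capitalization is fixed once, before the data reaches the database, and the dashboard only ever queries one consolidated table.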
We are the second team from the UCD MSBA program to work with CPMC-RI, which means part of our work was built on the previous year’s framework. It can be difficult to stay consistent and keep updates to a minimum while changing some of the functions. This issue made us think about the scalability of our platform, so we asked about additional data sources and applications that may be incorporated in the future, and revised our data structure so that it can be expanded to include that data. Through this design, we hope the platform can be continuously used and updated, and give our industry partners the analytical power to combine different sorts of data and conduct various analyses.
Our industry partners (MIP) from CPMC-RI have been very responsive and incredibly helpful. However, as medical scientists, they are not the best people to discuss technical details with, nor are they familiar with most of the tools we used. When we asked what they wanted to achieve in the final product, they didn’t give very clear requirements. In addition, as part of Sutter Health, CPMC-RI does not have its own data team, so technical support comes only from a few of Sutter’s data analysts, who are already occupied with many other projects. As a result, the team was initially lost and confused about what to do next. After consulting with our mentor and professor, we decided to just start building something so that we could make progress and improve based on our MIP’s feedback. This started the agile process of our dashboard development, in which we iteratively reviewed and revised our design at every meeting. It also made the team more involved, since we were not just waiting for instructions but taking the lead in the design process. It definitely helped us brainstorm many creative features to add to the dashboard.
Being able to solve problems creatively and proactively is one of the essential qualities of a data analyst. Because there are so many potential applications of data analytics, chances are we won’t know where to start or the analytics model won’t work as planned, similar to the difficult start we had designing the dashboard. Failing to draw useful insights from data is not uncommon, and I have learned to be flexible and experiment with different approaches. It is also important to communicate with the team and brainstorm new ideas to bring in different perspectives. At the same time, although data analysts need to dig deep into the details of data, it is also important to keep a strategic view of the big picture. Data analysts often work with people from other teams, such as marketing or product, to provide insights that can eventually be put into action. To achieve that, we should keep their needs and goals in mind, rather than building the most accurate model when it does not really apply to the problem.
It was a great feeling of accomplishment when we heard our MIP say “I think this can reduce our processing time from hours to minutes”. Our MIP is very happy with the platform we built, and we are very proud to be part of a research project that can potentially save people’s lives and revolutionize how cancer treatment is provided. We are in the last two months of our project and we want to finish strong. We hope that this platform can enhance CPMC-RI’s research process, help them be more efficient, and let them conduct analyses that they were not able to before. Furthermore, we hope that this will be the start of a continued effort to incorporate data analytics into their work through collaboration with the UCD MSBA program.