Journey through Data Science with the Data Professor
An Interview with Chanin Nantasenamat
No two paths into data science are alike. While some enter with a clear focus on working with data materials and processes, others may join through unorthodox avenues. In an industry that is relatively new compared with other technology spaces, many professionals are introduced to data analysis practices through peripherally relevant, sometimes totally unrelated disciplines.
In the case of bioinformatics researcher, professor, content creator, and now developer advocate Chanin Nantasenamat, PhD, his entry into data science happened before the field had fully formed. Presented with a research opportunity early in his academic career as a bioinformaticist, Nantasenamat noticed quickly how data could inform, enrich, and drive his research.
Since this first foray into the data science world through his academic work in biology, health informatics, and computational drug discovery, Chanin Nantasenamat has recently switched gears. As a content creator known as the Data Professor, he has amassed a global following through instructional videos on YouTube designed to prepare the next generation of data-minded professionals.
- How Chanin Nantasenamat Chose Data Science
- Realizing the Potential of Data Mining
- How Social Media Promotes Data Science Learning Opportunities
- Practicing Data Community Building
- The Intersection of Data Analysis and Developer Advocacy
- Building Data Literacy Skills
- Joining the Data Science Community
How Chanin Nantasenamat Chose Data Science
As a student, Nantasenamat began his academic career with field research in biology on the diurnal migration patterns of copepods. While some parts of his early work collecting research went smoothly, other aspects proved to be more challenging. He “found an advisor for undergraduate research,” he said, “and then went into the field, collected data, but still needed to analyze the data.” This last step would motivate his interest in integrating biological research methods with data modeling and analysis.
With this new enthusiasm for a different yet complementary branch of science, he went on to pursue a PhD and carried out his data collection as usual. But once that step was complete, he hit a roadblock. “At the time I had a lot of data on fluorescent proteins,” he told us, “and so the only way to make sense of all of this data was to perform some form of analysis.”
So early in his academic career, Nantasenamat had not yet been trained in such analysis and did not have easy access to information, so he decided to attend a conference on a research method that was new and unfamiliar to him: data mining. “I met a professor who has just graduated with a PhD from the US and did his research on data mining,” he said of the acquaintance who introduced him to the then up-and-coming field.
Equipped with a new focus and motivation to learn these new skills, Nantasenamat got to work. “Based on self-study and loose supervision,” he told us, “I was introduced to the field of data mining. And I applied data mining to analyze the fluorescent proteins.” The results in his biological research were fascinating. Through effective predictive models based on data he had collected and drawn conclusions from, he and his team “were able to predict the spectral properties (colors) of the jellyfish fluorescent proteins.”
This innovative and interdisciplinary approach would prove to be a turning point for his bioinformatics research and would direct many of his future scholarly and professional endeavors. More importantly, Nantasenamat was able to show the effectiveness of combining biological research and predictive modeling to make sense of data.
“We could digitize that information into qualitative and quantitative numbers and then we could use data mining, or data science as it’s known today,” he said, “to make a prediction on the color of the jellyfish’s fluorescent protein.” This was a major breakthrough for Chanin and his team of researchers: “It’s kind of like if you could digitize a physical phenomenon and then, with enough data points, you could build a prediction model.” The results of this study were published as a research article entitled “Prediction of GFP spectral properties using artificial neural network” in Wiley’s Journal of Computational Chemistry.
Realizing the Potential of Data Mining
While working with datasets has existed since the beginning of computing, the industry as we know it is a relatively new development. As processing and computing technologies have increased in speed, sophistication, and efficiency, both data science and data mining have become better resourced and better recognized as legitimate avenues of research, business, and innovation.
Through that lens, Chanin Nantasenamat got into the industry before it was even an industry. “Back in the day, data science was pretty much the same as innovative machine learning algorithms,” he offered. In other words, data mining and “deep learning wasn’t really mainstream at the time.”
These kinds of algorithms and structural concepts in the field would, however, begin to take a new shape with applications across fields and industries. As an educator promoting the benefits of data mining in biological and pharmaceutical research, Nantasenamat also began to take note of how data science techniques were already benefiting business. He pointed to “the CRISP-DM concept, which essentially means to have a business understanding, to process the data, build a prediction model, and then use the prediction model to drive back the decision-making process.” Through this model, Nantasenamat understood that research methods that benefit hard scientific pursuits could also be put to effective use across business endeavors.
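The four steps Nantasenamat lists can be sketched in a few lines of Python. This is only an illustrative sketch, not something from the interview: the business question (which customers to send a retention offer), the records, the function names, and the simple threshold "model" are all hypothetical. The formal CRISP-DM process also includes data understanding and evaluation phases, which are left out here for brevity.

```python
from statistics import mean

# Step 1, business understanding (hypothetical question):
# which customers should receive a retention offer?

# Made-up raw records for illustration only; one row has a missing value.
RAW_RECORDS = [
    {"visits": "12", "churned": "no"},
    {"visits": "2", "churned": "yes"},
    {"visits": "8", "churned": "no"},
    {"visits": "", "churned": "no"},
    {"visits": "3", "churned": "yes"},
    {"visits": "1", "churned": "yes"},
    {"visits": "10", "churned": "no"},
]


def prepare(records):
    """Step 2, process the data: drop incomplete rows and convert types."""
    rows = []
    for record in records:
        if not record["visits"]:
            continue  # skip rows with a missing visit count
        rows.append((float(record["visits"]), record["churned"] == "yes"))
    return rows


def fit_threshold(rows):
    """Step 3, build a prediction model: a deliberately simple one.

    The threshold is the midpoint between the average visit count of
    customers who churned and of customers who stayed.
    """
    churned = [visits for visits, left in rows if left]
    retained = [visits for visits, left in rows if not left]
    return (mean(churned) + mean(retained)) / 2


def decide(visits, threshold):
    """Step 4, feed the prediction back into the business decision."""
    predicted_to_churn = visits < threshold
    return "send retention offer" if predicted_to_churn else "no action"


if __name__ == "__main__":
    model_threshold = fit_threshold(prepare(RAW_RECORDS))
    for monthly_visits in (4, 9):
        print(monthly_visits, "->", decide(monthly_visits, model_threshold))
```

A real project would swap the threshold for a properly trained and evaluated model, but the overall loop from business question to data to model to decision stays the same.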
Data mining, and working with data generally, are only successful when other, sometimes unexpected, areas of focus are incorporated. “Aside from just data modeling aspects,” Nantasenamat told us, “there are other soft skills that are involved like data storytelling, communication, and putting data into context.” With these tools, data science professionals are able to deliver high-value and efficient decision-making processes to whatever organization they work for.
“Nowadays, data science is much more than just building models and getting high-accuracy metrics. It’s discovering how you can take that model, add value to it, and drive change in terms of the business use cases.”
How Social Media Promotes Data Science Learning Opportunities
In the same way that data science has evolved through the increasing sophistication of computer technology, communication and human connection have been unalterably changed by the advent of social media. Through these developments in technology and communication, social media has begun to play a massive role in the way data scientists and data professionals are able to connect and stay informed on developments in the field.
For Nantasenamat, his introduction into the data science social media space was inspired by someone at home. “My daughter would just jokingly say ‘Why don’t you create a YouTube channel to talk about what you’re doing?’” he told us. The innocent suggestion made Nantasenamat consider new ways to use his training and talents to reach new people who were interested in data science practices.
In August 2019, he decided to start recording. Though he recalled making the first video in the middle of the night and finding it “very awkward to talk to yourself in front of a camera,” he rapidly became more comfortable in his content creation and digital instruction. The experience “led to a series of other videos,” he said, “and a cataclysmic chain of events that eventually led to the channel growing as a result of collaborations with many other content creators.” Through his introduction to content creators and social media personalities (Johnathan Ma, Ken Jee, Tina Huang, Ravit Jain, Charly Wargnier, Francesco Ciulla, etc.), Nantasenamat was also able to enter into an inviting and active data science community.
Practicing Data Community Building
In addition to his concentration on offering interactive and engaging instructional video content to people interested in gaining new data science skills, Chanin Nantasenamat has also realized his new role in data community building. For both newcomers and for veterans of the industry, Nantasenamat understands that a thriving data community will be able to connect professionals of practically all skill levels.
These communities can certainly meet in physical spaces but also “could be on various social platforms like Twitter, LinkedIn, or Facebook,” he said. In these digital arenas, though, data science professionals of all experience thresholds can exchange information and connect with each other much more quickly. When people connect and collaborate through these different digital channels, the instructional settings begin to look a little familiar for Nantasenamat. He posits that there’s a certain overlap between the “medium of a YouTube channel and the medium of a typical classroom.”
Data community building should be viewed as an umbrella concept. Between introductions to new people with similar data interests, collaborations that inspire other data professionals, and opportunities for feedback on data-driven projects, data communities are structured to be inviting. Additionally, Nantasenamat finds that data communities are able to help each other learn different data platforms and software. “If you think of it,” he offers, “the entire data ecosystem and its tools are pretty much complementary.”
These different tools take different shapes and appeal to data professionals with different focuses. For example, some choose to work with the data pipeline or ecosystem, while his focus is more on “extracting data, transforming data, loading data so I might focus on deploying models.” Still others “might focus on ensuring the security of the data or the storage of the data.” Through these different avenues of practicing data science, Nantasenamat believes that the result can have profound, community-building effects.
“In spite of the heterogeneity of the elements in the data ecosystem,” he said, “I believe that [a data community] could form more or less likely a Symphony. You have strings, you have woodwinds, you have percussion. But if you combine everything together, you get a harmonious Symphony.” In the same way that bringing different instruments and music groups together sets the stage for a compelling artistic expression, different branches and concentrations of data science can inform and support each other. “But,” Nantasenamat clarified, “the thing is you need a conductor, right? You need a conductor to orchestrate everything playing together in harmony.” In other words, leaders in the field have a responsibility to share what they know, in order to cultivate talented, motivated newcomers.
As the data community continues to grow, Nantasenamat, as one of the space’s leaders, recognizes the importance of adapting to new changes in the field and continuing to learn. “It doesn’t mean that we know everything,” he said. “We’re also learning with the community, and it’s more or less like learning in public. We’re sharing what we’re learning, and the cycle repeats.”
The Intersection of Data Analysis and Developer Advocacy
Data analysis and data science can lead to new and unexpected career options that still drive innovation in the field. One such option that Nantasenamat chose to pursue is developer advocacy, a role that invites motivated professionals to marry a knowledge base of technical concepts with a focus on community outreach. Through the success of his YouTube channel and because of his new, much wider scale involvement in the data science community, Nantasenamat was able to leave his role as a professor of bioinformatics and become a developer advocate for Streamlit.
The new role has enabled him to build on the skills he cultivated as a data scientist and as a professor. Specifically, Nantasenamat defines developer advocacy as a “role which pretty much is the interface of developers and the company.”
As a developer advocate, his responsibilities encompass a wide range of areas. “The role of developer advocacy or developer relations,” he stated, “would be to give talks at conferences, to make tutorials either in the form of a YouTube video or in the form of a blog.” These front-facing responsibilities are ultimately familiar to Nantasenamat: “This role is similar to being a university professor or a teacher, where you take concepts that are in textbooks or in any discipline or domain knowledge of interest, and then relay that to the end user, which” in a classroom setting “are the students.”
“I’m seeing that there’s more or less like an overlap between developer advocacy and the role of a teacher,” he said.
Building Data Literacy Skills
Becoming proficient in data science and analysis certainly poses unique challenges for different people with different skills. One of the most important aspects for Nantasenamat in his data science instructional and educational videos is to manage expectations among those learning new skills and new modes of thinking. This expectation management principle can be tied to understanding the purpose and focus of data literacy.
According to Rahul Bhargava’s chapter in The International Encyclopedia of Media Literacy, there’s really no universally accepted definition of data literacy. Still, Bhargava offers that data-informed literacy skills build off of “abilities to acquire, analyze, represent, and argue with data.”
For Nantasenamat, data literacy has a more fluid definition that spans different tiers of understanding and proficiency. “Becoming data literate at a higher level would mean being able to use data to drive informed decisions that could be as basic as being able to make use of transformed data in Microsoft Excel to make useful graphs, plots, and data summary tables.”
By employing an approach that incorporates data analysis and data science into practically every appropriate situation, “decisions are then not based on guesses or emotions,” according to Nantasenamat. Instead, stakeholders and scientists alike “have data to back their choices up and are able to use that to drive their decision-making processes.”
Joining the Data Science Community
While practically all of the content that Chanin Nantasenamat creates caters to data professionals of all skill levels, one of his central goals is to bring budding analysts into the field. This objective is linked to the overall inviting, inclusive nature of the field that Nantasenamat recognizes. “I believe that anyone who has an interest in data or the potential of data could break into data science,” he encourages, “no matter how old you are, no matter what domain or field of study you’re coming from, even those who are afraid of programming or coding.”
While he recognizes there will inevitably be a steep learning curve, he believes that anyone with enough determination and focus can pick up the skills necessary to join the community and industry. Foundationally, Chanin Nantasenamat believes that newcomers should keep a level head when approaching the highly technical field. “It’s just a tool,” he offered, “and although there’s a lot of technology that’s involved, my advice would be: Don’t get lost in this.”
Just as Nantasenamat visualizes the data community as the sum of its parts in the form of a symphony, he acknowledges that diving into the field can look a little scarier. “Imagine sailing through a storm in the ocean. If you don’t have a compass, it’s very difficult to navigate through the storm. The weather’s turbulent, so in order to move through the storm, you need a good compass.” That compass for Nantasenamat is the immersive and engaging training that he promotes in his YouTube content and that you can find in the online classroom.
“With practice and with time, the coding and machine learning algorithms you might need to use become apparent and clearer in the moment. If you spend enough time, it’ll eventually click.”
One of the best ways to gain the skills necessary for success is through getting a degree in data science. Fast-tracked, intensive bootcamps and certification courses can also be helpful entry points for many aiming to redirect their careers. A comprehensive data science program, however, will offer a balance of technical, programming-focused skills along with a concentration on critical thinking and interpersonal communication. Learn more today about which program best fits your personal and professional goals and start working toward your future in data science.