How High Schoolers Can Start Learning Data Science
Data science is one of the most promising and rapidly growing career fields of the last decade. Technological advancements in how data is mined and analyzed have transformed data into a valuable commodity for a variety of fields, from finance to healthcare to academic research. Part computer scientist, part mathematician, and part trend spotter, data scientists are responsible for interacting with data in these areas to support informed industry decisions. As such, there is a growing need for up-and-coming generations of students to learn how to interact effectively with data. Just as they do in biology, chemistry, and physics, high school students should start building a strong foundation in data science early in their education to support growing proficiency later.
Current Paradigms in Data Science Training
Presently, most students are exposed only to computer science or computational thinking during high school and first interact with data science after entering college or the workforce. Computational thinking, however, is only one piece of the data science puzzle. Moreover, computer science and technical knowledge requirements are constantly evolving. As mining large amounts of data becomes more feasible, it also becomes more critical for the next generation of students to learn how to interact analytically and practically with larger data sets. Updating current curricula to include data science, in step with these technological shifts, provides high school students with an initial toolbox that they can use to build additional skills throughout college and their careers.
A strong early foundation in data science provides students with a variety of skills that extend beyond technical computer knowledge. Effective data scientists explore data trends to answer pressing questions and are often called upon to explain their results to non-technical audiences. As such, students enrolled in data science courses can learn how to apply their curiosity in a results-oriented fashion and develop critical communication skills. While not all students will become data scientists in the traditional sense, many careers intersect with data science and data scientists. For instance, multidisciplinary project teams at a biotechnology firm may pair data scientists with product engineers and managers. If the engineers and managers have an existing foundation of data science knowledge, cross-team communication about data analytics becomes more effective and productive. Thus, a variety of career paths benefit from an early introduction to concepts in data science.
Why Should Students Start Learning Data Science in High School?
A 2019 report by the National Association of Manufacturers and consulting firm Deloitte predicted that while the U.S. will create 3.5 million STEM jobs by 2025, approximately 2 million jobs will go unfilled due to a lack of skilled workers. In particular, these jobs will require workers with adept technological, computer, and critical thinking skills. Early exposure to data science during high school provides a constructive environment for students to build a foundation in all three areas. Research has also shown that early exposure to math and science courses in high school encourages students to pursue higher education in STEM and continue their studies even as the material becomes more challenging.
With particular regard to data science, employment for data scientists is anticipated to grow by 16% by 2028, faster than the average for all other professions. As demand for new and improved data-mining technology increases, employer demand for highly skilled data scientists will also increase. Data scientists currently enjoy excellent employment prospects, as numerous employers report difficulty recruiting suitably skilled workers.
Growing up in a technologically rich world has already primed the current generation of high school students to explore complex questions using data made available by current technologies. Natural data collectors, students already gather data from their everyday lives and make informed decisions based on it. For example, students use technology and social media to collect data on which celebrities are starring in the latest superhero blockbusters, predict their peers’ interest in the films, and use this information to decide which film to watch in the theater. Early, formal training in data science also affords students another hands-on, practical training experience in communication and collaboration skills.
As most careers are becoming integrated with data mining and big data applications, the next generation of workers will need to understand data science applications in order to effectively communicate with colleagues. Analytical synthesis and presentation are two key components of communicating data science. Within a formal data science course, high school students can learn how to communicate effectively to a variety of audiences in different contexts using mock presentation exercises:
- One-on-one. Here, the speaker engages with a single stakeholder. The goal is to convey a specific message using a compelling story that is supported by relatable facts.
- Small group discussion. A small group discussion may be a board, team, or private pitch meeting. Board or pitch meetings are formal, concise, and direct, while team meetings are less formal and more collaborative in nature.
- Classroom/Training (Medium audiences). The goal of classroom or training style presentations is to communicate a specific message in a relevant and attention-grabbing manner. Classroom style presentations should be well organized in an easy-to-follow and memorable structure: an introduction to the central question answered by the data, a clear and concise explanation of the data, and a summary of how the data answers the question.
- Conference (Large audiences). Conference audiences differ from classroom/training audiences in that the goal will be to incorporate brand building alongside clear and concise data presentation.
Don’t High Schools Already Teach Data Science?
Most high schools teach introductory computer processing and computer science, and some have also incorporated lessons on the basics of newer technologies. Unfortunately, few high schools have curricula dedicated to data science. Technology, and the way society interacts with it, is continuously evolving. Updating high school curricula to reflect these changes is necessary to prepare the next generation to work in the global economy.
How Should We Teach High School Students Data Science?
Effectively teaching high school students data science begins by setting clear and effective curriculum goals. These goals will, in turn, dictate course staffing requirements for the defined curriculum. While coverage of technical topics is expected, course curriculum should be designed to balance technical and non-technical topics. Every student will have a different level of initial interest in the topic and varying proficiency in technical and non-technical skills. As such, too much emphasis on teaching technical expertise early in the course may lessen the interest of non-technologically savvy students. Some key curriculum ideas are as follows:
- Understanding the importance of Data Science. Introductory data science courses should include overviews and discussions of data science applications in various contexts.
- Exploring career options within Data Science. Discussions of career options within data science should complement current job creation and hiring trends within the field.
- Learning how Data Science applies to other industries and career fields. High schools should collaborate with local universities and businesses to expose students to hands-on, practical applications of data science.
- Identifying real-world examples and applications of Data Science. Course assignments can include projects where students use data science tools and analytical techniques to answer real-world questions. Project endpoints may include detailed analytical reports and presentations.
- Routine technical and programming examinations. Technical examinations to evaluate student comprehension should be included in the data science curriculum, though exams do not necessarily need to be the main focus of the course.
When technical training becomes necessary in the progression of the course, proper programming tools and corresponding teaching tools are key. At minimum, high school data science curricula should include an introduction to newer programming languages like Python, R, and Scala in addition to C, C++, and Java. It is noteworthy, however, that some of these professional-grade tools are not always student- or teacher-friendly. Fortunately, instructor-friendly programs are also available to provide accessible, hands-on learning opportunities for students. Bootstrap, for example, allows instructors to build and adapt their own software and programming tools for data science courses. Bootstrap can cover introductory programming approaches, various chart visualizations, and core statistical concepts. In the classroom, teachers can combine these tools with application modules such as business or social studies research and allow students to explore real-world questions of interest to them.
As discussed above, working within the field of data science requires more than technical knowledge and experience. Data scientists routinely call upon communication and collaboration skills in day-to-day operations. Much like other high school classes, data science coursework should be integrated with other school courses like writing and public speaking to provide a well-rounded training experience in data science and help facilitate career preparation. Assignments centered on interpreting data sets may be associated with any coursework that requires creative problem-solving. Examples include compiling detailed written data analysis reports and presenting those reports to an audience. Written, oral, and visual presentation of analytical data encourages students to practice critical data analysis and handling of data outliers, explore surprising or unexpected data trends, and grapple with more complex data relationships.
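To make the idea of a first technical exercise concrete, the following is a minimal sketch in Python of the kind of assignment a class might attempt: summarizing a small data set and flagging outliers. It is an illustration only, not part of any particular curriculum. The function names and the survey data are hypothetical, and the 1.5 × IQR cutoff is one common rule of thumb for outliers, chosen here as an assumption.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    # quantiles(n=4) returns the three quartile cut points
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
    }


def find_outliers(values):
    """Flag values more than 1.5 * IQR outside the quartiles (assumed rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical classroom survey: hours of sleep reported by ten students
sleep_hours = [7, 8, 6.5, 7.5, 8, 7, 6, 7.5, 8.5, 2]

print(summarize(sleep_hours))
print(find_outliers(sleep_hours))
```

With this made-up data, the single 2-hour response is flagged as an outlier and pulls the mean below the median, which gives students something specific to discuss in a written report or presentation.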
Once a course curriculum is established, qualified teaching faculty should be onboarded with the particular curriculum goals in mind. While data scientists may seem like the best general choice for teaching a data science course, data scientists without an established teaching background may lack in-depth knowledge about the variety of career paths and options in data science. This is because the particulars of a data scientist's role often vary from position to position depending on the specific industry. Thus, data science faculty with teaching experience should be sought to lead data science courses, as they are more likely to know a variety of data science elements, applications, and career subdomains such as the following:
- Data Scientist. Data scientists source, manage, and analyze unstructured data to identify important unanswered questions and address defined ones. Data scientists also use storytelling and data visualization skills to synthesize and communicate their results to key stakeholders, thereby facilitating strategic decision-making.
- Data Analyst. Data analysts use programming, statistical, and mathematical skills to organize and analyze data to answer their organizations’ strategic questions. Like data scientists, data analysts are also responsible for communicating their findings to organization stakeholders.
- Data Engineer. Data engineers specialize in the development, management, utilization, and optimization of data pipelines designed to handle exponentially growing data volumes. These data pipelines are critical in transporting data to data scientists for processing and analysis.
Together, the right combination of non-technical and technical curriculum goals and suitably experienced teaching faculty will form a strong training course designed to shape the skilled STEM workforce of tomorrow.
Summary
Data is a growing commodity in a variety of industries, from business and finance to healthcare and academic research. As technology advances and provides easier, broader access to vast and complex data sets, the demand for skilled data scientists to manage and analyze the data will only continue to increase. The current generation of high school students has the advantage of growing up in a technologically rich, globally focused world, which makes them ideal candidates for early training in data science techniques and concepts. Introducing students to data science early in their educational careers provides them with a solid foundation in key concepts and the ability to understand and adapt to changing technologies as they are applied to data science. Even if students choose not to pursue an outright data scientist position, a working knowledge of data science concepts and applications enhances their soft skill set and competitive edge when they finally enter the job market.