Colleges incorporate data science into curriculums
10 October, 2012 16:15
Colleges have noticed the strong interest companies have taken in data science and are incorporating the field into their computer science curriculum. Here are a few examples of the many data education efforts being made at universities.
University of North Carolina, Charlotte
Two years ago, the university's College of Computing and Informatics and Charlotte businesses formed the Charlotte Informatics Partnership to determine how data science fits into business operations, says Yi Deng, the college's dean.
"Data is not an issue," he says. "It's the information and the insight embedded in the data that really are the issue. And those factors are becoming a principal drive for all industries. But if you look around the country, very few universities have the programs in these areas."
The public-private partnership aims to develop programs and practices that address corporate data science needs while turning Charlotte into an informatics hub. Understanding the connection between data and business requires a broad view, says Deng. To address that, the partnership will explore how data science fits into a firm's culture, investments and employees, in addition to what education the workforce requires. The school used input from executives of Charlotte-based companies, including Bank of America, home improvement retailer Lowe's and utility provider Duke Energy, to help craft its data science curriculum.
So far, the partnership has yielded master's and doctorate programs in bioinformatics, which were first offered two years ago. This fall the college launched a professional science master's in health informatics. In the next one to two years, the college will tackle the business and finance field by rolling out a professional science master's program in business analytics and informatics, created with the university's college of business, and an undergraduate financial services informatics program.
The college opted for this track since industries "across the board" have said that they need employees with a deep knowledge of analytics who also understand management concepts, Deng says. "It's extremely rare to find people who understand both. The goal is to train people who are technical but who can also lead, manage and work with others."
Though informatics is known in health care circles, the term remains "somewhat unfamiliar" to students, who may also be unaware of the field's career opportunities, Deng says. Given enterprise IT's strong interest in big data, he expects this to change.
"Some communication needs to be done in certain areas. But with the recent publicity, those things will spread very quickly."
Northwestern University
Northwestern University in Evanston, Illinois, entered the data science education field this semester with a full-time, 15-month master of science in analytics program offered at its McCormick School of Engineering and Applied Science. Interest in the program proved strong and attracted hundreds of applicants for only 30 slots, says Diego Klabjan, the program's director and an associate professor of industrial engineering and management sciences.
"The goal is to have a highly selective, high quality program so that's why we're not going for numbers," says Klabjan. "We don't want to have 60, 70, 80 or 100 students."
Ideal candidates hold a bachelor's degree in an analytics-related field, such as statistics, computer science, economics or industrial engineering, and have between two and five years of work experience in analytics, he says. Most of the 33 students in the first class meet these parameters, says Klabjan, adding that there are exceptions in the group such as recent college graduates and IT professionals with more than a decade of work experience.
The university went with a master's program because analytics is a professional method of science and an undergraduate degree delivers the required foundation, says Klabjan. Teaching the fundamentals, like basic data modeling or statistics, would require a program longer than 15 months, he adds.
The program is divided into thirds and teaches the IT, science and business aspects of analytics, says Klabjan. Some topics covered in the IT portion include data warehousing and workflow management. Science courses will look at machine learning and data mining, among other topics. The business curriculum covers communication, conveying information to business users and project leadership, which gets a full course. Elective courses based on vertical industries like marketing and health care are also offered. Students will spend their summer completing a mandatory off-campus internship before returning to campus for the final quarter.
To learn the software behind big data, students attend a software boot camp a few days before the program starts, during which they're trained in IBM's SPSS and Cognos, Tableau, Hadoop and SAS products.
Beyond the classroom learning, students will serve as de facto consultants and work on business-sponsored data projects. Students start an eight-month project during the first quarter and complete a capstone project in the final quarter. Students, divided into teams of four, will interact directly with participating companies during weekly meetings and be responsible for delivering a completed project.
"These are projects that the companies have on their to-do lists," says Klabjan. "Some don't have analytical solutions in-house. Or they're interested in starting [an] analytics practice within their organization. The deliverable of this project is always going to be the implementation, so companies will use these projects."
University of Washington
An immediate and strong demand for a new kind of data analyst led the University of Washington in Seattle to introduce a data science certificate program this October. These analysts use their strong math and technology backgrounds to build data analysis tools and then use that data to make business decisions, says Ed Lazowska, the director of the University of Washington eScience Institute, in an email.
Focusing on training IT professionals instead of undergraduate or graduate students helps meet market needs faster, he says.
Applicants should be proficient in programming, databases and basic statistics, Lazowska says. To test these skills, they must take an online quiz. Their score, along with their resume, is considered when determining admission.
The nine-month, three-course program "filled up faster than any new certificate program in recent memory," writes Lazowska. An August information session for the program, which is offered through the university's Professional and Continuing Education department, reached its 175-person capacity the week it was announced.
The first course teaches students about the different tools used in data management, storage and manipulation, including instruction in and hands-on experience with Hadoop and MapReduce. The second course deals with understanding core statistical and machine learning techniques, including deploying and maintaining a Hadoop cluster. The final course will combine the lessons from the earlier modules and teach the technology that data workers use to find value in the data sets. Each course will address interpreting and communicating statistical results, says Lazowska.
"This area is not typically emphasized in technology-focused curricula, but is critical in data science." He adds that each course also addresses data visualization, especially the first one, which has many lectures on the topics.
Three big data experts, one from the University of Washington's faculty and two Microsoft employees, will teach the course.
In the spring, the course will go online when it is added to the curriculum of Web-learning company Coursera. Lazowska hopes to eventually offer the course to students in the College of Engineering. In the on-campus version, database instructors from the computer science and engineering department will teach the course.
Data science has appeared in the undergraduate and graduate introductory database curriculum as well. The courses now include subject matter on NoSQL and scalable data processing, writes Lazowska. The university is looking into developing a full-time, on-campus master's program after a "banner year in recruiting" gave it the necessary faculty, including four prominent professors in the big data and machine learning fields.
Another certificate program is also being developed around courses that are relevant to data science and already taught at the university. Examples are courses on scalable systems, databases, statistics and visualization, Lazowska says.