ORIE Initiates a Financial Data Science Certificate Program: FDS@CFEM
At a time when the ability to use “big data” analytics and machine learning algorithms is in great demand, ORIE has introduced a new certificate in Financial Data Science (FDS) that augments and complements the established Master of Engineering concentration in Financial Engineering (MFE for short). The certificate is available to MFE students as an optional semester following their fall semester at Cornell Financial Engineering Manhattan (CFEM), which in turn follows the first year of the program, held in Ithaca.
Analytics and machine learning respond to the explosion of data and computer processing power available in many areas of endeavor, including financial services. The Institute for Operations Research and the Management Sciences (INFORMS) defines analytics as “the scientific process of transforming data into insight for making better decisions.” In 1959 IBM’s Arthur Samuel defined machine learning as “giving computers the ability to learn without being explicitly programmed,” and applied an early version to the game of checkers. Historically, the field of statistics had to develop ways to derive better decisions from a paucity of data and processing power, but these limitations no longer constrain the analyst who is skilled in contemporary data science.
The curriculum for ORIE’s FDS@CFEM certificate combines computing techniques and machine learning with hands-on projects brought by leading financial practitioners. The curriculum was designed by a committee of experts to equip students with mastery of machine learning and computational “big data” technologies so they can solve real, large-scale finance problems. A select group of students has been recruited for the inaugural semester.
By design, all FDS@CFEM students take the same tightly integrated set of courses, which are bound together through a “practicum” course. They take three courses: one on machine learning, another on “big data” computing techniques, and a practicum that integrates the knowledge gained in these courses with their earlier FE studies through a series of practitioner-led projects incorporating real financial data.
“The idea is that as students learn different concepts in the classroom, they assimilate the material and put it into practice tackling real problems with the help of practitioners,” according to M.Eng. Director Kathryn Caggiano.
“In a typical program, students study self-contained topics in ‘silos’ of courses taught by instructors who independently develop their syllabi, outcomes and deliverables with little or no collaboration, leaving students to figure out how, when, and why to combine these concepts on their own,” Caggiano said. Instead, FDS@CFEM provides students with “content that is integrated with practical experience and a realistic context in which to apply it.” Moreover, for FDS@CFEM, “not only are the instructors planning and designing their courses collaboratively, but they are communicating with each other throughout the delivery process to make sure that students understand the how, when, and why of applying the concepts to real problems.”
The computing techniques course is taught by Patrick Steele, currently a fifth-year ORIE PhD student who has extensive experience working with large data sets at eBay and in his research on routing and scheduling. Steele’s course includes such practical topics as interacting with website displays to extract data, using relational and non-relational databases, using the Python programming language to explore data sets, and creating reliable software and maintaining control of successive versions.
The machine learning course is taught by Professor Huseyin Topaloglu, currently at Cornell Tech in New York City. There he is also working to launch the new ORIE Master of Engineering (MEng) program that begins in Fall 2016. Like FDS@CFEM, the Cornell Tech MEng confronts making decisions at massive scale and at rapid speed, in the Cornell Tech case at companies like Amazon, Google, Lyft and Netflix. Topaloglu’s course “emphasizes machine learning as a computational tool to develop large-scale decision support systems to drive business intelligence,” according to the syllabus. To do so, the course covers a broad range of statistical methods, from regression and Bayesian methods to classification and neural networks.
Like the existing third MFE semester at CFEM, the certificate program takes advantage of the New York City location of CFEM to involve practitioners, from Wall Street and elsewhere, in projects. “Practitioner engagement is at the heart of all CFEM programs, and FDS@CFEM was designed with that spirit in mind,” said CFEM Director Victoria Averbukh. Among these practitioners are some who are able to bring data sets for use in projects. According to ORIE Senior Research Associate Sasha Stoikov, who leads the FDS@CFEM practicum course as well as a biweekly colloquium that also provides interaction with guest speakers from the industry, “these ‘quants’ are at the center of the FDS practicum and give our students important insights into the business applications of these data sets.”
Like the existing third MFE semester that the FDS@CFEM students have just completed, which entails an intensive team-based consulting project for a financial services client, the FDS semester derives “its greatest value from exposing students to current industry practices via an interaction with successful industry professionals,” according to the committee proposal that led to the creation of the certificate.
However, unlike these third-semester projects, in the FDS@CFEM practicum all of the FDS students pursue all of the projects, and these projects are much more structured, according to Stoikov. Even the timing of the courses reflects the coherence of the curriculum: for example, since all of the FDS practicum projects require advanced handling of data, the bulk of the technologies course is taught as a “bootcamp” in nine six-hour sessions held during a two-week period at the beginning of the semester.
In the inaugural semester, all of the practicum projects involve large data sets provided by practitioners, each of whom uses a series of lectures to introduce the problems.
- Marcos Lopez de Prado, a Senior Managing Director at Guggenheim Partners, has assigned practical problems related to corporate bond data traded at high frequency.
- Arseniy Kukanov, a Vice President at global quantitative investment management firm AQR Capital Management, has assigned practical problems based on the electronic record, or ‘order book,’ of buyer and seller interest in specific stocks.
- Richard Yeh, a quantitative analyst at BondEdge Solutions, a division of Interactive Data (which was recently acquired by Intercontinental Exchange Data Services), has introduced the students to the use of Twitter data to infer investor sentiment in predicting returns, unusual trading volume, and volatility. In anticipation of this project, during the bootcamp phase of the technology course, the students used what they were learning to build software that continuously monitors Twitter for any mention of stock ticker symbols such as $MSFT and $AAPL. Their programs run 24/7, recording all tweets containing any of around 400 different symbols in a database for later analysis in the program.
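The ticker-monitoring step described in the last item can be illustrated with a minimal Python sketch. The article does not say how the students built their software, so everything here is an assumption made for illustration: the two-symbol watchlist, the regular expression used to spot symbols, the SQLite table layout, and the function names `extract_cashtags`, `open_store` and `record_tweet` are all hypothetical. The connection to Twitter itself is left out; the sketch covers only what might happen to each tweet once it has been received.

```python
import re
import sqlite3

# Hypothetical watchlist: the article mentions around 400 symbols
# but names only these two.
WATCHLIST = {"MSFT", "AAPL"}

# Assumed pattern: "$" followed by 1-5 letters, not glued to a preceding
# word character or another "$", and ending at a word boundary.
CASHTAG = re.compile(r"(?<![\w$])\$([A-Za-z]{1,5})\b")


def extract_cashtags(text):
    """Return the set of watched ticker symbols mentioned in a tweet."""
    found = {symbol.upper() for symbol in CASHTAG.findall(text)}
    return found & WATCHLIST


def open_store(path=":memory:"):
    """Open (or create) a SQLite database with one row per tweet and symbol."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mentions ("
        "tweet_id TEXT, symbol TEXT, body TEXT, "
        "PRIMARY KEY (tweet_id, symbol))"
    )
    return conn


def record_tweet(conn, tweet_id, text):
    """Store the tweet once per watched symbol; return how many symbols matched."""
    symbols = extract_cashtags(text)
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            # OR IGNORE makes a repeated delivery of the same tweet harmless.
            "INSERT OR IGNORE INTO mentions VALUES (?, ?, ?)",
            [(tweet_id, symbol, text) for symbol in sorted(symbols)],
        )
    return len(symbols)
```

In a long-running monitor, `record_tweet` would be called for every incoming tweet, and the `mentions` table could later be queried by symbol for the kind of sentiment analysis the project calls for.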
ORIE MEng Director Kathryn Caggiano and CFEM Director Victoria Averbukh recruited the inaugural set of FDS@CFEM certificate students last spring from the cadre of MFE students who had been scheduled to graduate at the end of 2015, asking “do you want to be prepared for an industry landscape that is shifting?” and “are you ready for a challenge?” Four students were selected who are now pursuing their fourth semester:
- Yue (Luna) Cheng is a business administration, finance and applied math graduate of the Emory University Goizueta Business School in Atlanta, GA. Less than two months after the program began, Cheng was asked to update her resume. “I thought there is not much to tell,” she said, but “when I updated my LinkedIn profile and wrote down what we had learned, it surprised me. We were introduced to and worked with different databases I have never heard of before: MongoDB, PostgreSQL, HDF5. We use web scraping and Twitter streaming to work with online and social media data. We also learned different machine learning techniques like regression and classification. I believe that before graduation we can learn a lot more,” she said.
- Shuaijia (Dora) Dai is an actuarial science and mathematical statistics graduate from Purdue University in West Lafayette, Indiana. In the third Financial Engineering semester, Dai did a project with high frequency trading platform provider Lucera in which “I had to familiarize myself with high frequency data and perform statistical analysis on it. Through that project, I realized that there was so much I did not know about dealing with complex unstructured data,” she said. This, and “being able to master new statistical learning algorithms, a lovely thing to me,” motivated her to pursue the FDS@CFEM certificate, Dai said. “Eventually I will have a nice website demonstrating all my projects this semester, so that will be really cool,” she added.
- Shaojie (John) He is a math and economics graduate of Peking University in Beijing, China. Last winter, he headed the team of MFE students that tied for first place in the fourth annual Academic Affiliate Membership Student Competition sponsored by the International Association for Quantitative Finance. “During my job interview, most people from quant positions are very interested in my experience related to machine learning,” He said. “The machine learning course is taught by a great professor [Topaloglu] who leads us to go over all the major algorithms and drill into the mathematical details, which provides us with a solid understanding of the models,” he reported, noting as well that “the financial data practicum is taught by experienced practitioners who give us excellent projects to work on and detailed guidance.”
- Yuhong (Irene) Shi is a financial engineering graduate of Wuhan University in Wuhan, China.
All four students acquired experience as interns in New York City last summer. Instructor Steele said, “I'm impressed with the students: they were eager to learn and did well. Sometimes they'd try to keep working through lunch until we reminded them to stop.”
Fast track implementation
“Creation of FDS@CFEM is an example of the entrepreneurial nature of ORIE and CFEM, and our dedication to making sure that students receive cutting-edge and relevant practical education,” said CFEM Director Averbukh.
According to Averbukh, the idea for expanding into Financial Data Science originated at the 2014 CFEM Advisory Council meeting. There, Shmoys raised the idea and solicited feedback from Council members. Everyone responded with enthusiasm and excitement. By April 2015, a committee chaired by MEng Director Caggiano had created the proposal that led to the creation of FDS@CFEM. Members included Averbukh, ORIE Assistant Professor Andreea Minca, Shmoys, Stoikov and Topaloglu. They gathered data from students and practitioners, formulated a curriculum, and outlined the steps necessary for the program to be implemented by Spring 2016, as it has been.
“We are in a golden age of machine learning and ‘big data’ technologies,” said Stoikov. “Interest in applications of these tools to finance has grown tremendously and our committee realized that our financial engineering students would benefit from strong exposure to this exciting field,” he said.
Additional information about the program, including a video overview, can be found here.