COURSE BENEFITS:
PROFESSOR CHING-YUNG LIN:
Dr. Ching-Yung Lin is with Graphen, Inc. He was the IBM Chief Scientist, Graph Computing, and an IBM Distinguished Researcher. He established and led the Network Science and Machine Intelligence Department at the IBM T. J. Watson Research Center. He has been an Adjunct Professor at Columbia University since 2005. He was also an Affiliate Professor at the University of Washington from 2003 to 2009 and an Adjunct Professor at New York University (NYU) in 2014.
Dr. Lin was elevated to IEEE Fellow in November 2011, the first IEEE Fellow in the area of Network Science. He is an author of 200+ publications and ~50 awarded patents. In 2010, the IBM Exploratory Research Career Review selected Dr. Lin as one of the five researchers "most likely to have the greatest scientific impact for IBM and the world." His "Big Data Analytics" course at Columbia University was the top result of a Baidu search for "Big Data Analytics".
From 2012 to 2015, he led a team of ~40 researchers from Columbia University, CMU, Northeastern Univ., Northwestern Univ., UC Berkeley, Stanford Research Institute, Rutgers Univ., Univ. of Minnesota, and NMU in the largest US social media analysis project, which included 26 tasks. In 2015, he was invited to be a panelist together with the White House Chief Data Scientist at the semi-annual conference of the American Medical Association. He has been invited as a keynote speaker at 20+ conferences, including the Expo 2.0 at the New York Javits Convention Center in 2009. He was among the earliest researchers driving Machine Learning in Computer Vision and, in 2003, initiated the first large-scale video annotation project, carried out by 111 researchers at 23 institutes worldwide. His work won 7 best paper awards, including the Best Paper Awards at ACM CIKM 2012 and IEEE BigData 2013, and was featured 4 times by BusinessWeek magazine, including as the Top Story of the Week in May 2009.
Dr. Lin's team is now focusing on building a novel software platform that simulates functions of the brain, in order to build AI solutions for a range of industrial sectors, including Finance, Medicine, Energy, and Automotive.
APPLICABLE DEGREE PROGRAMS:
COURSE FEES:
Lecturer/Manager: Prof. Ching-Yung Lin
Class Day & Time: Friday 7:00pm - 9:30pm
Location: Mudd 833
Credits for course: 3
Class Type: Lecture
Prerequisites: This will be a hands-on course. Students need to know at least one programming language (Python, C, C++, Java, Perl, and/or JavaScript) to finish the homework and the final project.
Description: With the advance of IT storage, processing, computation, and sensing technologies, Big Data has become a new norm of life. Only recently have computers become able to capture and analyze all sorts of large-scale data from all kinds of fields: people, behavior, information, devices, sensors, biological signals, finance, vehicles, astronomy, neurology, etc. Almost all industries are bracing for the challenge of Big Data and want to dig out valuable information that gives them the insight to solve their problems. This course provides the fundamental knowledge that equips students to handle those challenges. The discipline inherently involves many fields. Because of its importance and broad impact, new software and hardware tools and algorithms are quickly emerging. A data scientist needs to keep up with these ever-changing trends to be able to create state-of-the-art solutions for real-world challenges.
This Big Data Analytics course will first give an overview of applications, market trends, and the things to learn. Then, I will introduce the fundamental platforms, such as Hadoop, Spark, and other tools, e.g., Linked Big Data. Afterwards, the course will introduce several data storage methods and how to upload, distribute, and process the data. The course will go on to introduce different ways of handling analytics algorithms on different platforms and systems. Then, I will introduce visualization issues and mobile issues in Big Data Analytics. Moreover, students will learn introductory AI-related big data technologies, such as Generative AI and Large Language Models. Students will gain the fundamental knowledge of Big Data Analytics needed to handle various real-world challenges.
Students will choose their own topics for the final project. The application domain can be based on the students' own interests. This will be a good opportunity for students to apply what they learn in class to their own needs, whether future work requirements or research problems at hand.
TAs (CAs/Graders): Apurva Patel (amp2365) and Linyang He (lh3288)
Required Textbook(s): None
Reference Textbook(s): Class notes, and reference books or papers
Homework(s): Five assignments (HW#0 - HW#4), including programming and written reports.
Project(s): Final project in which students conduct research and hands-on implementation on a self-selected Big Data Analytics topic. Team collaboration of up to 3 students is encouraged.
Paper(s): Reports for each homework assignment and for the final project results. Oral presentations of the final project are required: a proposal, an intermediate presentation, and a final presentation.
Midterm Exam: None
Final Exam: None
Grading: Five homework assignments: 50%; Final Project (proposal, intermediate and final presentations, report, open source code, and presentation video): 50%
Hardware: PC with Internet access.
Software: Students may use their preferred software (Python, JavaScript, C/C++, Java, or Perl) on their computers, together with Google Cloud, to complete homework assignments.
Homework: Submitted through Columbia CourseWorks.
Course Outline:

Lecturer: Ching-Yung Lin
Class Day & Time: Friday 7:00pm - 9:30pm
Location: Mudd 327
Credits for course: 3
Class Type: Lecture
Prerequisites: This will be a hands-on course. Students need to know at least one programming language (C++, Java, Perl, Python, and/or JavaScript) to finish the task milestones and the final project.
Description: The Big Data Analytics area is evolving at a speed seldom seen in history. New software and hardware tools are emerging and disruptive. Furthermore, its boundary with Artificial Intelligence is blurring. We may no longer find a clear distinction between a Big Data Analytics problem and an AI problem. In this Advanced Big Data Analytics course, we will devote ourselves to something new: "How far can we go in building a brain that mimics human functions through state-of-the-art computer science and electrical engineering technologies?" What we would like to discuss is not machines that play games (chess, question-answering quizzes, or Go) or recognize voices and faces, but how machines could possibly achieve what is unique to human beings. Our brains can reason, can associate, and can memorize. We have feelings, emotions, ethics and morality, arts, and consciousness. We dream during the night. In this course, students will conduct research and development on tasks that collectively contribute to building intelligent machines that are like humans, or more knowledgeable than humans, through analyzing Big Data.
Most lectures will be divided into two parts. The first part will be presentations by Prof. Lin or guest speakers explaining the potential Computer Science / Electrical Engineering technologies for building such machines. The second part will be students' presentations on their progress in the 4 areas: (1) Cognitive Robot, (2) Robo-Advisor, (3) Healthy Life, and (4) Advanced Artificial Intelligence.
TAs (Graders): Shiyu Wang (sw3601)
Required Textbook(s): None
Reference Textbook(s): Class notes, and reference books or papers
Homework(s): None
Task: Each student will need to sign up for a task in one of these four areas: (1) Full-Brain AI, (2) Financial Advisor, (3) Healthy Life, and (4) Green Earth. Each area will have 15 tasks. Task lists will be announced in Lecture 1. Each task will have three milestones. Each milestone includes programming, a presentation, and a written report.
Project: Final project in which students define a Big Data Analytics application and accomplish it by applying the software built in any combination of the 60 tasks in the class. Team collaboration of 2 students per project is encouraged.
Paper(s): A report for each milestone and for the final project results. Oral presentations of each milestone and of the final project. Source code will be submitted to a course GitHub repository. The final project will also include a video presentation.
Midterm Exam: None
Final Exam: None
Grading: Task Milestones: 45% (3 milestones including presentation and report, each milestone: 15%); Advanced AI Study Presentations: 15% (3 presentations, each one: 5%); Final Project (presentation, report, source code, and video): 30%; Class Participation: 10%
Hardware: PC with Internet access.
Software: Depending on the task, students will need to use appropriate software (C++, Java, Python, Perl, and/or JavaScript) on their computers to complete the task milestones and the final project.
Task: Submitted through a course website.

| Date | Class | Lecture Topics | Student Presentations |
|---|---|---|---|
| 01/19/24 | 1 | | |
| 01/26/24 | 2 | | |
| 02/02/24 | 3 | Full-Brain AI (I) & Green Earth (I) | |
| 02/09/24 | 4 | Financial Advisor (I) & Healthy Life (I) | |
| 02/16/24 | 5 | Advanced AI Study | |
| 02/23/24 | 6 | Advanced AI Study | |
| 03/01/24 | 7 | Full-Brain AI (II) & Green Earth (II) | |
| 03/08/24 | 8 | Financial Advisor (II) & Healthy Life (II) | |
| 03/15/24 | SPRING BREAK | | |
| 03/22/24 | 9 | Advanced AI Study | |
| 03/29/24 | 10 | Cognition | Advanced AI Study |
| 04/05/24 | 11 | Full-Brain AI (III) & Green Earth (III) | |
| 04/12/24 | 12 | Financial Advisor (III) & Healthy Life (III) | |
| 04/19/24 | 13 | Edge AI | Advanced AI Study |
| 04/26/24 | 14 | Advanced Artificial Intelligence | Advanced AI Study |
| 05/03/24 | 15 | | |