Machine Learning

Join us on Wednesday, June 9th, as we learn about machine learning and artificial intelligence, computer science, and their impacts on our society! We'll explore the intersection of machine learning and math modeling, and try some machine learning of our own. We hope to see you there!

View the Video Here
View the Problem Here

What is Machine Learning?

Machine Learning is a field of computer science that teaches computers how to make decisions and predictions. Data scientists and computer programmers essentially feed machines data and use code to teach them new patterns and analyses.

Why should I be excited about Machine Learning?

Machine learning (ML) is everywhere: from a Roomba learning the most efficient routes to clean a room, to an Amazon Echo (Alexa) answering your questions, to self-driving cars that can navigate the road safely. You can even use machine learning to understand what a cat is trying to say to you! But how does machine learning work in the first place?

The magic behind machine learning is a lot of data. We give data to the computer so it can learn patterns and build a machine learning (ML) model. This process is called training. Then we feed the model data it hasn’t seen before to see if it performs the way we expect. This process is called testing.
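To make training and testing concrete, here is a minimal sketch in Python. The fruit weights, the labels, and the function names train, predict, and evaluate are all invented for illustration, and the "model" is just an average weight per label, far simpler than a real ML model.

```python
# Toy data, invented for illustration: (weight in kg, label).
training_data = [(0.15, "apple"), (0.18, "apple"), (0.20, "apple"),
                 (4.5, "watermelon"), (5.2, "watermelon"), (6.0, "watermelon")]
# Data the model never sees during training.
testing_data = [(0.17, "apple"), (5.0, "watermelon"), (0.25, "apple")]


def train(examples):
    """Training: learn the average weight for each label."""
    totals, counts = {}, {}
    for weight, label in examples:
        totals[label] = totals.get(label, 0.0) + weight
        counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}


def predict(model, weight):
    """Pick the label whose learned average is closest to the input."""
    return min(model, key=lambda label: abs(model[label] - weight))


def evaluate(model, examples):
    """Testing: fraction of unseen examples the model labels correctly."""
    correct = sum(predict(model, w) == label for w, label in examples)
    return correct / len(examples)


model = train(training_data)
accuracy = evaluate(model, testing_data)
print(f"Accuracy on unseen data: {accuracy:.0%}")
```

The key idea is that evaluate only ever sees examples that train did not, which is what makes the accuracy a fair check.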

For example, let’s say we want the computer to understand the difference between dogs and cats. To us, this may seem like an easy task. However, to the computer, an image of a cat is just a lot of numbers: to be specific, 120x120x3 = 43,200 numbers for an RGB (color) image that is 120 pixels wide by 120 pixels high. We need to build a machine learning model and train the computer to understand these images. So, we give the computer a lot of labeled images of dogs and cats so it can detect patterns and learn to distinguish the two. After the training phase, we are ready for the testing phase. We give the computer an image of a cat it hasn’t seen before and see if it correctly labels it as a cat rather than a dog. Thanks to advances in computer processors over the past decade, computers can now store and process hundreds of thousands of data points, allowing us to have strong machine learning models.
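Here is a small sketch of what "an image is a lot of numbers" looks like in Python. The all-orange image is made up for illustration, and it is stored as plain nested lists; in practice a library such as NumPy would usually hold the same numbers in an array.

```python
WIDTH, HEIGHT, CHANNELS = 120, 120, 3  # channels: red, green, blue

# A made-up image where every pixel is the same orange: (R, G, B) = (255, 165, 0).
image = [[[255, 165, 0] for _ in range(WIDTH)] for _ in range(HEIGHT)]

# Count every number the computer has to look at.
total_numbers = sum(len(pixel) for row in image for pixel in row)
print(total_numbers)   # 43200, which is 120 * 120 * 3

# The pixel in the top-left corner is just three numbers.
print(image[0][0])     # [255, 165, 0]
```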

There are many fields of machine learning, including natural language processing, reinforcement learning, and image classification. The focus of this workshop is on image classification, a type of machine learning that teaches computers what an image means in real life. For your math modeling problem, you will be asked to build a machine learning model that can solve the task of image classification.

What do machine learning scientists do?

Machine learning scientists work on solving problems that improve human lives. They program neural networks and build machine learning models to solve such problems. For example, machine learning scientists apply image classification to X-rays to identify whether someone could have lung cancer. This makes it a lot easier and quicker to form a life-saving diagnosis.

Machine learning scientists have also made facial recognition possible! From using your face to unlock your iPhone to catching criminals and tracking hundreds of passengers at airport security, facial recognition is integral to protecting privacy and maintaining security.

Machine learning scientists are also developing algorithms to help turn self-driving cars into a reality. Autonomous vehicles rely on ML algorithms to learn how to avoid obstacles, safely navigate busy roads, and protect their passengers from harm.

Affectiva, a company based in Boston, is already using image classification in real life. Affectiva develops convolutional neural networks (CNNs) to recognize faces and facial expressions and to help ensure the safety of drivers and passengers in taxi services such as Uber or Lyft.

Affectiva uses image classification to recognize a driver’s emotions. For example, if a driver on the road looks tired or sleepy, Affectiva can alert the driver to stay awake so that they can stay safe. This same program can understand when a customer is satisfied or upset with their taxi service, based on their expressions. And finally, the most important way the company is using image classification is to determine whether the traveler is actually safe inside the car. If the passenger looks scared, then Affectiva could alert police to follow the car to ensure the safety of the passenger.

Machine learning scientists at Disney’s animation studios use image classification to improve their digital characters! By running programs that detect different faces and emotions, Disney animators can check how realistic their characters are and make sure that everyone keeps their face in the middle of action scenes or other dynamic movements. Here you can see an example of what that looks like. The green squares around each character indicate that a program has detected a face at that given location! This information can help animators detect what exactly shows up on-screen.

Finally, ML scientists at Uber ATG, a branch of Uber dedicated to autonomous vehicles, are using image classification in self-driving cars. Self-driving cars are cars that are able to drive on the road by themselves without help from a driver. With image classification, self-driving cars can see other moving cars and pedestrians around them so they can navigate roads safely. Self-driving cars also use image classification to detect road signs and the color of traffic signals so they know when to drive, slow down, or stop. This information helps the car decide how to drive safely, where to turn, and what parts of the road to avoid.

All of this is possible because ML scientists also collect and sort through a lot of data. You need thousands of ML scientists to create reliable data, sift through it, label each data point, and make sure it doesn’t have any biases (i.e., that the dataset covers all possible cases) in order to have strong machine learning models.

How does math modeling apply to machine learning?

Machine learning, at its core, is just math modeling! The different layers, functions, and datasets used to train a neural network create an equation a computer can use to analyze different inputs, or in our case, turn images into arrays of numbers and then into labels.

Think of it like this: while mathematicians create models to clarify parts of how the world works, machine learning models and neural networks are trained to find hidden patterns in the world and to be just as accurate as, if not more accurate than, the human brain.

So after several rounds of training, validation, and testing, the final neural network is nothing more than a function: a mathematical process that transforms an input into an output. Each layer adds another calculation, such as a statistic or probability, that gets translated into code for the neural network to learn from. And as programmers, we get to choose which mathematical techniques to use when creating our model.
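As a rough illustration of "a neural network is a function," here is a tiny two-layer network written as plain Python functions. The weights, biases, and input values are made up for this sketch (training would normally find the weights), and the sigmoid activation is just one common choice, assumed here.

```python
import math


def layer(inputs, weights, biases):
    """One layer: weighted sums of the inputs, squashed into the range 0 to 1."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(1 / (1 + math.exp(-total)))  # sigmoid activation
    return outputs


def network(inputs):
    """The whole network is a function: numbers in, a score out."""
    # Made-up weights; training would normally find these values.
    hidden = layer(inputs, weights=[[0.5, -0.2], [0.1, 0.8]], biases=[0.0, -0.1])
    output = layer(hidden, weights=[[1.2, -0.7]], biases=[0.05])
    return output[0]


score = network([0.9, 0.3])
print(f"Score between 0 and 1: {score:.3f}")
```

Calling network with the same input always gives the same output, which is exactly what makes it a function. An image classifier works the same way, only with thousands of inputs and many more weights.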

Helpful Resources

About Machine Learning

What is Machine Learning?, What is a Neural Network?, What is a Convolutional Neural Network?, What is a Recurrent Neural Network?, Natural Language Processing for Machine Learning, What is Reinforcement Learning?

About Machine Learning and Math Modeling

What is Image Classification?, A Quick Tutorial on Image Classification

Famous Machine Learning Speakers/Scientists

Andrew Ng, Fei-Fei Li, Lex Fridman, Michael I. Jordan, Sebastian Thrun

Machine Learning Careers

What do Machine Learning Engineers Do?, Top Career Paths in Machine Learning

Glossary

Classify (in machine learning): to label a given input datapoint
Computer science: a STEM field dedicated to studying computers and the process of computation, both theoretically and practically, by applying math, coding, engineering, and logic principles.
Convolutional neural networks: a type of neural network that is specifically designed to analyze pixels for image classification.
Data: units of information, often numeric, that are collected through observation and used in analysis and scientific exploration.
Epochs: (in machine learning) the number of times a neural network passes through the entire training dataset
Image classification: (in machine learning) the process of categorizing, labeling, and defining images based on specific rules and patterns extracted by a neural network.
Machine learning: a field of computer science that builds models to teach computers how to make decisions and predictions as independently as possible by extracting patterns from data.
Model: a system, pipeline, formula, or other representation used to describe or imitate a real world process.
Neural networks: (in computer science) a model programmed with a collection of nodes and parameters to mimic the neurons in a human brain.
Parameters: measurable characteristics whose values help define the programmed model and the conditions it uses to produce results.
Research: The process of doing experiments and studies in order to formulate hypotheses, conclusions, and theories about the world.
Shuffling (in machine learning): a technique used to reorder data in a dataset, which helps ensure that data remains random and as representative of the real world as possible.
Testing set: The set of data separated out from your initial dataset and used only at the end to test your finished model and calculate its unbiased accuracy and other metrics.
Training set: The set of data separated out from your initial dataset used to train your model.
Validation set: The set of data separated out from your initial dataset used to check your model’s accuracy and other metrics while you are still building it, so you can tune the model’s parameters to make it better.
Weights (in machine learning): the learnable parameters in a machine learning model that define the strength of the connections between nodes in neighboring layers.
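Several of the glossary terms above (shuffling, training/validation/testing sets, and epochs) fit together in one short Python sketch. The dataset of 100 numbered examples and the 70/15/15 split are assumptions chosen for illustration, and no real model is trained here.

```python
import random

# Hypothetical dataset: 100 numbered examples standing in for labeled images.
dataset = list(range(100))

# Shuffling: reorder the data so each split is a random mix.
random.seed(0)  # fixed seed so the shuffle is repeatable
random.shuffle(dataset)

# Split 70% / 15% / 15% (one common choice, assumed here).
training_set = dataset[:70]
validation_set = dataset[70:85]
testing_set = dataset[85:]

# Epochs: each epoch is one full pass through the training set.
EPOCHS = 3
examples_seen = 0
for epoch in range(EPOCHS):
    for example in training_set:
        examples_seen += 1  # a real model would update its weights here

print(len(training_set), len(validation_set), len(testing_set))  # 70 15 15
print(examples_seen)  # 210
```

Because the three sets are slices of the same shuffled list, no example appears in more than one set.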

Helpful Computer Science Terminology

Algorithm: A series of steps or a procedure.
Array: A type of data which stores elements of the same kind, such as numbers, in rows, columns, and higher dimensions. Generally, arrays can have elements reordered, added, or removed.
Boolean: An expression that returns either true (1) or false (0) and can help programs decide which blocks of code to run or ignore.
Bug: A flaw in the code that causes a program to produce an incorrect or unexpected result. Bugs can be removed with additional research, new lines of code, or trial-and-error “debugging”.
Class: A collection of data and functions used together for a specific purpose. In object-oriented programming languages like Python (used in this event), classes provide a template for creating objects with initial values.
Compiler: A computer program which changes code written in one programming language, often a high-level one, into another. Compilers are usually used so that high-level languages like C++ can be transformed into machine code that can be directly understood by a computer. (Python, used in this event, is typically run by an interpreter instead.)
Data type: A type of data, like integers, floats (decimals), arrays, lists, or booleans.
For-loop: A statement that allows the block of code inside of it to be run a certain number of times, such as once for each element in a list.
IDE: (Integrated Development Environment) Any kind of application that allows you to write, edit, and run programs.
Memory: Digital storage, or “space,” that allows a computer to collect data about a program as it runs. Computers often have a fixed amount of memory, which can limit what a program can do.
Method/function: A named procedure that carries out a given task on certain objects. Methods and functions can either be pre-defined by the programming language or written by the coder within the file.
Object: A variable, data structure, function, or method that has a value assigned to it in the program. Objects can have data stored in them or methods enacted on them.
Run-time error: An error that occurs while the program is running, rather than one detected before the program is executed.
Variable: A name or symbol that refers to a specific piece of data and allows methods to be applied to it independently of other pieces of data.
Program: An executable collection of code that runs on a computer. It is similar to a script, but is often much larger in size and does not require a scripting engine to run. Instead, a program consists of compiled code that can run directly from the computer's operating system.
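Many of the terms above show up together even in a very short Python program. The sketch below is only an illustration; the Counter class, the average function, and the scores are invented for this example.

```python
class Counter:
    """Class: a template for objects that each keep their own count."""

    def __init__(self):
        self.count = 0          # data stored in the object

    def add(self, amount):      # method: a function that belongs to the class
        self.count += amount


def average(numbers):
    """Function: a named procedure that carries out one task."""
    return sum(numbers) / len(numbers)


scores = [90, 75, 60]           # variable naming a list of integers
counter = Counter()             # object created from the class

for score in scores:            # for-loop: runs once per element
    passed = score >= 70        # boolean: True or False
    if passed:
        counter.add(1)

print(counter.count)            # 2
print(average(scores))          # 75.0

try:
    average([])                 # run-time error: division by zero while running
except ZeroDivisionError:
    print("Cannot average an empty list")
```

The last few lines show a run-time error in action: the code is valid Python, but dividing by zero fails only once the program is actually running.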

Video Citations

Video Production Credits

Featuring: Clarise Liu, Rumaisa Abdulhai

Researchers and Script Writers: Clarise Liu, Garima Prabhakar, Rumaisa Abdulhai

Editors: Clarise Liu

Images Used in Our Video

www.stockunlimited.com/vector-illustration/robot-confused-about-bird_1428004.html

www.iberdrola.com/innovation/machine-learning-automatic-learning

stanfordmlgroup.github.io/projects/chexnext/

medium.datadriveninvestor.com/machine-learning-on-facial-recognition-b3dfba5625a7

www.hyundainews.com/en-us/releases/2887

www.affectiva.com/

en.wikipedia.org/wiki/Walt_Disney_Animation_Studios

www.wired.com/wiredinsider/2019/12/deep-learning-disney-sorts-universe-content/

eng.uber.com/uber-atg-iccv-corl-iros-2019/

medium.com/@feiqi9047/the-data-science-behind-self-driving-cars-eb7d0579c80b

www.quora.com/How-is-AI-similar-different-to-the-human-brain

www.kdnuggets.com/2016/11/intuitive-explanation-convolutional-neural-networks.html/3

medium.com/@itsmescottb123/what-is-a-neural-network-a8cfd5b18dc0

colab.research.google.com/?utm_source=scs-index

en.wikipedia.org/wiki/Keras

www.i2tutorials.com/what-is-the-difference-between-training-dataset-testing-dataset-validation-dataset-what-is-the-common-ratio/

www.nctm.org/Publications/Mathematics-Teacher/2016/Vol110/Issue5/Mathematical-Modeling-in-the-High-School-Curriculum/

Resources Referenced in Our Video

wp.wpi.edu/touchtomorrow

mmmjam.github.io

colab.research.google.com

keras.io