Education

University of Chicago (UChicago), Chicago, IL

Expected Graduation: December 2023
Master of Science in Computational and Applied Mathematics
• Overall GPA: 3.86/4.00
• Key Courses: Big Data, Machine Learning and Large-Scale Data Analysis, Scientific Computing with Python, Applied Linear Algebra, Modern Applied Optimization, Mathematical Computation IIB: Nonlinear Optimization

University of California, Los Angeles (UCLA), Los Angeles, CA

Graduated: June 2022
Bachelor of Science in Applied Mathematics with Specialization in Computing
• Overall GPA: 3.87/4.00
• Key Courses: Machine Learning, Probability and Statistics, Mathematical Modeling, Numerical Methods, Optimization

Certified Specializations

Google Data Analytics

• Issue Date: June 2023
• Credential ID: 7TAHUXLNSGXA

Learn SQL Basics for Data Science

• Issue Date: June 2023
• Credential ID: PNWSTGUV7LZS

Python for Everybody

• Issue Date: July 2023
• Credential ID: QFSLPRBMX2CW

AI+Science Summer School 2023

• Issue Date: July 2023

Industry Experience

Internship

UChicago Research Computing Center
September 2022 – September 2023
Data Analytics: Chicago Booth Rental Apartment Discrimination Data Web Scraping
• Built a web scraping pipeline using Python's Beautiful Soup to extract historical rental apartment data from the Wayback Machine's web archives.
• Integrated Named Entity Recognition (NER) strategies into the pipeline, streamlining the data extraction process and reducing errors from intricate HTML structures.

Research Computing: High Performance Computing Clusters Midway2 and Midway3
• Proactively responded to technical queries and troubleshooting requests from users of Midway2 and Midway3, ensuring optimal functionality and performance of these dedicated supercomputers for UChicago research.
• Revised the Midway2 and Midway3 user guide, replacing obsolete content and refining navigation so that users can locate the information and solutions they need more quickly.

Data Research Intern

East China Normal University
June 2021 – November 2021
• As part of a data science team building a Python-based film analysis application, transformed facial recognition output into an interactive network of movie characters visualized with Python's NetworkX.
• Managed the data pipeline, including the design, population, and maintenance of a SQL database containing character IDs and timestamps; piloted the system with the film "12 Angry Men," generating 28,000 data points for validation.

Quantitative Analytics Intern

East China University of Science and Technology
June 2021 – September 2021
• Deployed a Flask web application backed by a SQL database for enterprise risk assessment and trading recommendations.
• Improved prediction accuracy by 5% using regression and Random Forest models in Python, trained on data from over 2,000 enterprises.

Big Data Projects

Rental Apartment Discrimination Analysis

Chicago Booth
Spring 2023
• Engineered a Python web scraping pipeline using Beautiful Soup to extract over 10,000 historical rental apartment records from the Wayback Machine archives, ultimately achieving close to 100% extraction accuracy, and loaded the cleaned data into a SQL database as practice.
• Integrated Named Entity Recognition strategies to optimize data extraction.

Hotel Reservation Cancellation Analysis

Chicago Booth
Spring 2023
• Collaborated in a diverse team to conduct a thorough exploratory analysis and data transformation on a hotel reservation dataset, leading to the identification of key variables impacting cancellation patterns and major influences on hotel pricing.
• Led the implementation of statistical and machine learning models, including regression, PCA, Random Forest Classifier, and causal lasso, ultimately predicting hotel cancellations with 90% accuracy and providing strategic insights to minimize hotel revenue losses and better understand customer behavior.

MNIST Bayesian Classification

University of Chicago
Spring 2023
• Utilized the Expectation-Maximization (EM) algorithm to generate Bernoulli mixture models for class-specific features in the MNIST dataset, focusing on efficient matrix-vector multiplications for rapid convergence.
• Applied Bayes' rule to classify test data with 90% accuracy and monitored the log-likelihood to verify algorithmic convergence.

Binary Image Classification of Cats and Rabbits

UChicago
Fall 2022
• Implemented and compared three binary classifiers, based on Linear Discriminant Analysis, Logistic Regression, and Naive Bayes, to identify the animal in each digital image as a rabbit or a cat.
• Converted images to numerical features through edge detection based on the wavelet transform, using Python's PyWavelets module.
• Used Python's scikit-learn module to build, train, and test the Logistic Regression and Naive Bayes models.

Fraudulent Job Posting Prediction

UCLA
Spring 2022
• Built a detector that distinguishes real from fake job postings, using an open Kaggle dataset of 18,000 multi-genre text job postings with 18 features, such as job location and benefits, to help users identify real jobs efficiently.
• Tuned a Quadratic Discriminant Analysis model with grid search cross-validation, achieving around 98% recall on both the training and test sets; numerical features were extracted from the raw text with NLTK bag-of-words models.

Los Angeles Travel Planner Application

UCLA
Spring 2021
• Built a user-friendly Flask web application for LA tourists that quickly generates a personalized, dynamic GeoPy route map from their travel itinerary, offering accurate and current travel suggestions tailored to the user's needs.
• Scraped over 20,000 tourism-related data entries from TripAdvisor with Beautiful Soup, then structured the HTML data and loaded it into a SQL database, creating a user-search-driven online database.

Penguin Classification

UCLA
Winter 2021
• Built a decision tree model with 98% accuracy and a multinomial logistic regression model with 100% accuracy to identify penguin species.
• Used cross-validation to select the optimal training features and evaluated the trained models with decision region plots.

Student Organizations

Vice President of Business Department

Chinese Students and Scholars Association
July 2019 – June 2021
• Negotiated and secured sponsorship contracts with approximately 20 businesses across various industries.
• Led an in-person welcome event for new students, addressing their concerns and queries efficiently and showcasing strong communication, problem-solving, and decision-making skills in an ever-changing environment.

Member of Career Development Department

Southwestern Chinese Students and Scholars Association
September 2020 – August 2021
• Orchestrated an online career fair connecting 20 companies with around 500 potential student candidates, handling ambiguity and driving independent workstreams within a larger team project.
• Created and managed promotional content for the career fair on social media, attracting a viewership of around 2,000.
• Acted as a liaison between companies and students, managing communications and resolving issues as they arose.

Member of Operations Development Department

United States Chinese Elite Consortium
February 2020 – November 2020
• Collaborated on a highly visible project team for the "Chinese-Youth Cloud Summit", which amassed over 765,000 views and garnered extensive media attention with 237 U.S. reports.
• Contributed to the team's success through effective teamwork, project management, and innovative strategies to maximize the reach and impact of the event.