Shenghan Zhou

Education

Johns Hopkins University

Sep 2025 - Jun 2027

Master of Science in Engineering in Computer Science

I will enroll in the MSE program in Computer Science at Johns Hopkins University in Fall 2025.

Chongqing University

Sep 2021 - Jun 2025

Bachelor of Engineering in Artificial Intelligence

GPA: 92/100, 3.848/4.0︱Major Ranking: 1/63︱Grade Ranking: 3/253

University of California, Irvine

Jun 2024 - Aug 2024

Program: UCInspire

UCInspire is a 3-month program that gives international students the opportunity to do research at the University of California, Irvine. Students collaborate closely with faculty members and gain hands-on research experience that supports both academic and professional growth. In this program, I worked with Professor Xiaohui Xie on a project related to human motion generation and finished with a GPA of 4.0/4.0.

University of California, Berkeley

Aug 2024 - Dec 2024

Program: Berkeley Global Access (BGA) Program

The Berkeley Global Access (BGA) Program is a four-month academic program that offers international students the opportunity to enroll in advanced undergraduate and graduate-level courses at the University of California, Berkeley. Beyond coursework, the program fosters academic immersion, cross-cultural exchange, and professional development through seminars, networking events, and access to Berkeley’s world-class research environment.

I took the following courses:

CS 180: Intro to Computer Vision and Computational Photography | Instructor: Alexei (Alyosha) Efros | Score: A- | see my project page
CS 188: Intro to Artificial Intelligence | Instructors: Pieter Abbeel & Igor Mordatch | Score: A
CS 168: Introduction to the Internet: Architecture and Protocols | Score: A-

Publications

CoMA: Compositional Human Motion Generation with Multi-modal Agents

We proposed CoMA, a compositional human motion generation framework built on multi-modal agents, to refine complex human motions generated from textual descriptions. Both quantitative and qualitative results demonstrate that CoMA can generate more realistic and diverse human motions than state-of-the-art methods. We submitted the paper to the IEEE/CVF International Conference on Computer Vision (ICCV) 2025.

View publication | View project page

Accelerating Style Transfer: Enhancing Efficiency of Diffusion-based Models with Advanced Sampling Methods

I conducted this research independently to explore how recent accelerated sampling techniques can improve the efficiency of image generation in state-of-the-art style transfer models. The results show that UniPC improved the efficiency of high-quality stylized image generation fivefold compared with existing methods.

View Publication

Projects

Deep Learning Based AI Face Recognition System

Independent Study

I developed a face recognition system by collecting 8,000 facial images with OpenCV and processing them for feature extraction and data normalization. The dataset was split into 6,266 training images and 1,567 testing images. I then built, trained, and optimized a convolutional neural network (CNN) in TensorFlow, achieving a recognition accuracy of 91%. To demonstrate a practical application, I integrated the model into a web-based login system built with Flask, enabling real-time user sign-up and authentication through facial recognition.
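The reported split sizes are consistent with a roughly 80/20 split of 7,833 images. The sketch below shows one way such a split could be produced; the function name `split_dataset`, the fixed seed, and the 0.2 test fraction are assumptions for illustration, not the project's actual code.

```python
import random


def split_dataset(items, test_fraction=0.2, seed=0):
    """Shuffle items reproducibly and split them into (train, test) lists.

    Hypothetical helper: the name, seed, and default fraction are assumptions.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded shuffle keeps the split repeatable
    test_size = round(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


# Image indices stand in for the processed face images.
train, test = split_dataset(range(6266 + 1567))
print(len(train), len(test))
```

Seeding the shuffle keeps the training and testing sets identical across runs, which makes accuracy figures comparable between experiments.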

Financial Fraud Detection Based on Dynamic Graph Representation Learning

Group Member

We proposed a novel model architecture that integrates spatial and temporal aggregation within graph structures. Experimental results demonstrate that this design effectively captures the spatiotemporal features of dynamic graph data and outperforms most existing dynamic graph neural networks.

YOLOv3-based Object Detection in Complex Weather Conditions

Group Leader

We reproduced the methods from “Image-Adaptive YOLO (IA-YOLO) for Object Detection in Adverse Weather Conditions,” which introduces a differentiable image processing (DIP) module for the YOLO detector, with parameters dynamically predicted by a convolutional neural network (CNN-PP). We changed the YOLO module to YOLOv3 and designed a channel attention mechanism to combine different image features. The results confirmed that integrating the channel attention mechanism and improving the network architecture enable the network to adaptively select appropriate filters for image preprocessing, thereby improving detection performance in challenging environments.
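The description above does not specify how the channel attention was built. As a minimal sketch, assuming a squeeze-and-excitation style design with a single linear layer, channel attention can be written as follows; `channel_attention`, `weights`, and `biases` are hypothetical names, and the parameters would be learned in practice.

```python
import math


def channel_attention(feature_maps, weights, biases):
    """Reweight channels by a learned importance score (illustrative sketch).

    feature_maps: list of C channels, each an H x W list of floats.
    weights, biases: hypothetical C x C matrix and length-C vector.
    """
    # Squeeze: global average pooling reduces each channel to one number.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]
    # Excite: a linear layer followed by a sigmoid gives a score in (0, 1).
    scores = []
    for weight_row, bias in zip(weights, biases):
        z = sum(w * p for w, p in zip(weight_row, pooled)) + bias
        scores.append(1.0 / (1.0 + math.exp(-z)))
    # Scale: multiply every value in a channel by that channel's score.
    scaled = [
        [[value * score for value in row] for row in channel]
        for channel, score in zip(feature_maps, scores)
    ]
    return scores, scaled


# Two 2x2 channels with identity weights and zero biases.
features = [[[0.0, 0.0], [0.0, 0.0]], [[2.0, 2.0], [2.0, 2.0]]]
scores, scaled = channel_attention(features, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
print(scores)
```

Channels with a higher pooled response receive a score closer to 1, so their features contribute more to the following layers.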

Deep Learning and Neural Networks: Theory and Applications

Group Leader

In this project, I explored a series of deep learning tasks across classification, segmentation, and generative modeling. I began by comparing logistic regression and feedforward neural networks on handwritten digit recognition, finding that a feedforward neural network trained with the Adam optimizer yielded the fastest convergence and highest accuracy. I then moved to image segmentation, where I trained a U-Net model with various regularization strategies and found that L2 regularization effectively mitigated overfitting, achieving a Dice score of 0.993. For generative models, I implemented text-to-image generation using diffusion models, replicating experiments from “Scalable Diffusion Models with Transformers” to validate their effectiveness. Additionally, I led a team to enhance the CycleGAN architecture by improving its U-Net backbone, successfully replicating the artistic styles of Van Gogh, Monet, and Ukiyo-e. Together, these subprojects deepened my understanding of model optimization, generalization, and creativity in AI.
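For reference, the Dice score used to evaluate segmentation measures the overlap between a predicted mask and the ground-truth mask: twice the intersection divided by the sum of the two mask sizes. The sketch below computes it for flattened binary masks; the function name `dice_score` and the smoothing term `eps` are assumptions, not the project's code.

```python
def dice_score(pred, target, eps=1e-7):
    """Dice coefficient for two flattened binary masks of equal length.

    eps is a small smoothing term (an assumption) that avoids division by
    zero when both masks are empty.
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return (2.0 * intersection + eps) / (total + eps)


# One overlapping pixel out of two predicted and one true: 2 * 1 / (2 + 1).
partial = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
perfect = dice_score([1, 0, 1, 0], [1, 0, 1, 0])
print(round(partial, 4), round(perfect, 4))
```

A score of 1 means the masks match exactly, so the reported 0.993 indicates near-complete overlap.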

Design and Implementation of 3D Human Body Reconstruction System

Graduation Project

This study proposes a single-image 3D human body reconstruction method based on diffusion models, aiming to reduce hardware requirements and improve accessibility compared to multi-view or 3D scanning approaches. Leveraging SMPL-X prior guidance, the method uses a multi-view diffusion model to synthesize RGB images and normal maps from a single frontal image. A Vision Transformer (ViT)-based encoder extracts global features, while a multi-view decoding Transformer fuses cross-view information to enhance representation of texture and geometry. A dual-branch feed-forward network then learns implicit geometric and color fields, from which high-quality 3D meshes are extracted using the Marching Cubes algorithm. Quantitative and qualitative evaluations on the THuman2.0 and CAPE datasets demonstrate the method’s strong performance in geometry accuracy, texture consistency, and structural completeness, validating its effectiveness and generalizability.
