Ahmed Taha

Research Scientist


Research Interests: Feature Embedding, Metric Learning, Deep Networks, Machine Learning, Image Segmentation, Texture Classification, Patch Matching.
Technical Skills: Python, C/C++, Java, OpenCV, MATLAB, MEX files, and CUDA
TensorFlow, PyTorch, Keras, SimpleITK, Caffe
Education: Computer Science PhD (+MS), GPA 4.0/4.0 - University of Maryland
Master of Business Administration (Marketing Major), GPA 3.83/4.0 - Arab Academy for Science and Technology
Computer Science BS, GPA 3.81/4.0 - Alexandria University, Faculty of Engineering



We propose singular value maximization (SVMax) to learn a more uniform feature embedding. The SVMax regularizer supports both supervised and unsupervised learning. Our formulation mitigates model collapse and enables larger learning rates.
SVMax: A Feature Embedding Regularizer, arXiv 2021
Ahmed Taha, Alex Hanson, Abhinav Shrivastava, Larry Davis
Project Page
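The SVMax entry above describes rewarding larger singular values of the feature embedding. A minimal sketch of that idea follows, restricted to 2-D embeddings so the singular values can be computed in closed form with the standard library. The function names, the `weight` parameter, and the choice of the mean singular value as the regularized quantity are illustrative assumptions, not the paper's exact formulation.

```python
import math

def l2_normalize(row):
    # Scale a non-zero vector to unit length.
    norm = math.sqrt(sum(x * x for x in row))
    return [x / norm for x in row]

def mean_singular_value_2d(embeddings):
    """Mean singular value of an n x 2 matrix whose rows are L2-normalized."""
    rows = [l2_normalize(r) for r in embeddings]
    # Entries of the 2 x 2 Gram matrix G = E^T E.
    a = sum(r[0] * r[0] for r in rows)
    b = sum(r[0] * r[1] for r in rows)
    d = sum(r[1] * r[1] for r in rows)
    # Eigenvalues of a symmetric 2 x 2 matrix in closed form.
    trace, det = a + d, a * d - b * b
    root = math.sqrt(max(0.0, trace * trace - 4.0 * det))
    eigenvalues = [(trace + root) / 2.0, max(0.0, (trace - root) / 2.0)]
    # Singular values of E are the square roots of the eigenvalues of G.
    return sum(math.sqrt(v) for v in eigenvalues) / 2.0

def regularized_loss(task_loss, embeddings, weight=1.0):
    # Hypothetical combination: subtracting the mean singular value
    # rewards embeddings that spread over the unit circle.
    return task_loss - weight * mean_singular_value_2d(embeddings)

collapsed = [[1.0, 0.0]] * 4  # every sample maps to the same point
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
```

In this toy setting the collapsed batch has a smaller mean singular value than the spread batch, so the regularized loss favors the spread embedding for the same task loss.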
We propose an evolution-inspired training approach to boost performance on relatively small datasets. The knowledge evolution approach splits a deep network into two hypotheses: the fit-hypothesis and the reset-hypothesis. We iteratively evolve the knowledge inside the fit-hypothesis by perturbing the reset-hypothesis for multiple generations.
Knowledge Evolution in Neural Networks, CVPR Oral 2021 (acceptance rate 24% -- Oral 4%)
Ahmed Taha, Abhinav Shrivastava, Larry Davis
Project Page
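The fit/reset split in the Knowledge Evolution entry above can be sketched as a binary mask over the weights. This is a toy illustration under stated assumptions: the flat weight list, the 50% mask, the uniform re-initialization range, and the function name `next_generation` are all hypothetical, and the training step that would update the weights within each generation is omitted.

```python
import random

def next_generation(weights, fit_mask, rng, init_scale=0.1):
    """Keep fit-hypothesis weights and re-initialize reset-hypothesis weights."""
    return [
        w if keep else rng.uniform(-init_scale, init_scale)
        for w, keep in zip(weights, fit_mask)
    ]

rng = random.Random(0)
weights = [0.5, -1.2, 0.3, 0.9]
fit_mask = [True, True, False, False]  # hypothetical split: first half is the fit-hypothesis

for generation in range(3):
    # A real run would train the whole network here before the reset.
    weights = next_generation(weights, fit_mask, rng)
```

Because training is left out, the fit-hypothesis weights stay fixed in this sketch; only the reset-hypothesis entries are perturbed from one generation to the next.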
We formulate attention visualization as a constrained optimization problem. We leverage the unit L2-norm constraint as an attention filter (L2-CAF) to localize attention in both classification and retrieval networks.
A Generic Visualization Approach for Convolutional Neural Networks, ECCV 2020 (acceptance rate 27%)
Ahmed Taha, Xitong Yang, Abhinav Shrivastava, Larry Davis
Project Page
We extend standard architectures with an embedding head. We apply a ranking regularizer to the embedding head to improve (1) classification accuracy and (2) feature embedding.
Boosting Standard Classification Architectures Through a Ranking Regularizer, WACV 2020
Ahmed Taha, Yi-Ting Chen, Teruhisa Misu, Abhinav Shrivastava, Larry Davis
Project Page
We introduce an unsupervised formulation to estimate heteroscedastic uncertainty in retrieval systems. We propose an extension to triplet loss that models data uncertainty for each input.
Unsupervised Data Uncertainty Learning in Visual Retrieval Systems, arXiv 2019
Ahmed Taha, Yi-Ting Chen, Teruhisa Misu, Abhinav Shrivastava, Larry Davis
arXiv Link
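The data-uncertainty entry above extends triplet loss with a per-input uncertainty term. One common way to do this, shown here as a sketch, is to attenuate the loss by a predicted variance and add a log-variance penalty. That weighting scheme, the squared Euclidean distance, the margin value, and the function names are assumptions chosen for illustration and may differ from the paper's formulation.

```python
import math

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Standard hinge form: pull the positive closer than the negative by a margin.
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)

def uncertain_triplet_loss(anchor, positive, negative, log_variance, margin=0.2):
    """Triplet loss attenuated by a predicted log-variance (assumed form)."""
    base = triplet_loss(anchor, positive, negative, margin)
    # A high variance down-weights the triplet; the second term
    # penalizes predicting a high variance everywhere.
    return 0.5 * math.exp(-log_variance) * base + 0.5 * log_variance

anchor, positive, negative = [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]
confident = uncertain_triplet_loss(anchor, positive, negative, log_variance=0.0)
uncertain = uncertain_triplet_loss(anchor, positive, negative, log_variance=1.0)
```

In a trained system the log-variance would be predicted by the network for each input rather than passed in by hand.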
We cast visual retrieval as a regression problem by posing triplet loss as a regression loss. This enables epistemic uncertainty estimation in retrieval using dropout as a Bayesian approximation framework. Accordingly, Monte Carlo sampling is leveraged to boost retrieval performance.
Exploring Uncertainty in Conditional Multi-Modal Retrieval Systems, arXiv 2019
Ahmed Taha, Yi-Ting Chen, Teruhisa Misu, Larry Davis
arXiv Link
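The Monte Carlo sampling mentioned in the entry above can be sketched as repeated stochastic forward passes with dropout left on. In this sketch a single linear layer stands in for the embedding network; the function names, the dropout placement, and the use of the sample mean as the embedding and the sample variance as an uncertainty estimate are illustrative assumptions.

```python
import random

def embed_with_dropout(features, weights, drop_prob, rng):
    """One stochastic forward pass: drop inputs at random, then apply a linear layer."""
    keep_scale = 1.0 / (1.0 - drop_prob)  # inverted dropout keeps the expected value
    dropped = [x * keep_scale if rng.random() >= drop_prob else 0.0 for x in features]
    return [sum(w * x for w, x in zip(row, dropped)) for row in weights]

def mc_dropout_embedding(features, weights, drop_prob=0.5, num_samples=100, seed=0):
    """Average several dropout passes; the spread across passes estimates uncertainty."""
    rng = random.Random(seed)
    samples = [embed_with_dropout(features, weights, drop_prob, rng)
               for _ in range(num_samples)]
    dim = len(weights)
    mean = [sum(s[i] for s in samples) / num_samples for i in range(dim)]
    variance = [sum((s[i] - mean[i]) ** 2 for s in samples) / num_samples
                for i in range(dim)]
    return mean, variance

identity = [[1.0, 0.0], [0.0, 1.0]]
mean, variance = mc_dropout_embedding([1.0, 2.0], identity, drop_prob=0.5)
```

With `drop_prob=0.0` every pass is identical, so the mean equals the deterministic output and the variance is zero.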
(a) An example of a sequential residual network. (b) An example of a directed acyclic graph. (c) The residual graph derived from (b). (d) The U-Net architecture. Downward arrows represent down-sampling or strided convolutions. Upward arrows represent up-sampling or deconvolution. (e) The Res. U-Net architecture. It consists of two threads. The first thread carries scale-specific features (the green channels) and follows an architecture similar to the U-Net. The second thread is the residual architecture, including the red, orange, and blue channels.
Segmentation of Renal Structures for Image-Guided Surgery, MICCAI 2018
Junning Li, Pechin Lo, Ahmed Taha, Hang Wu, Tao Zhao
Kid-Net architecture. The two contradicting phases are colored in blue. The down-sampling and up-sampling phases detect and localize features, respectively. The segmentation results at different scale levels are averaged to compute the final segmentation.
Kid-Net: Convolution Networks for Kidney Vessels Segmentation from CT-Volumes, MICCAI 2018
Ahmed Taha, Pechin Lo, Junning Li, Tao Zhao
Self-supervised learning framework formulated as a four-class classification problem. Given a tuple of an RGB frame and a stack of difference (SOD), the network reasons about frame ordering and spatio-temporal correspondence. Valid and invalid ordered motion are highlighted in green and red, respectively. Class III shows a tuple of a weightlifting RGB frame and a SOD encoding a boxing action: a valid sequence, but no spatio-temporal correspondence.
Two Stream Self-Supervised Learning for Action Recognition, CVPRW 2018
Ahmed Taha, Moustafa Meshry, Xitong Yang, Yi-Ting Chen, Larry Davis
(DeepVision) - Extended Abstract, GitHub Code
Texture Synthesis with Recurrent Variational Auto-Encoder, arXiv 2017
Rohan Chandra, Sachin Grover, Kyungjun Lee, Moustafa Meshry, Ahmed Taha
GitHub Code
Seeded Laplacian approach overview on a toy image of size 48 × 48. (b) The toy image is perturbed with Gaussian noise and overlaid with seeded annotations (blue for background and yellow for foreground). (c) Setting a zero threshold on the eigenfunction vector produces (d) the final segmentation result.
Seeded Laplacian: An Interactive Image Segmentation Approach using Eigenfunctions, ICIP 2015
Ahmed Taha, Marwan Torki
GitHub Code
The effect of discriminative embedding. Left: Image with provided user scribbles, red for foreground (FG) and blue for background (BG). Middle: 3D plot of the RGB channels for the provided scribbles. The scribbles are mixed in the RGB color space. Right: 3D plot of the first 3 dimensions of our discriminative embedding. The color modalities present in the scribbles are preserved. Note that the FG has two modalities, namely skin color and jeans, and the BG also has two modalities: the sky and the horse body.
Multi-Modality Feature Transform: An Interactive Image Segmentation Approach, BMVC 2015
Moustafa Meshry, Ahmed Taha, Marwan Torki


Research Experience:

Selected Awards:

Teaching Experience:


After earning my BSc, I spent some time developing mobile apps for iOS. I co-founded Inova, a software development company.