Available for opportunities

Ali Fouladgar

Cloud Infrastructure Architect | AI/ML Platform Engineering Specialist | Building scalable AI-driven platforms

10+ years Integration • AI • Cloud • Infrastructure
50+ Projects
10+ Journal Papers
400+ Citations
About

I have spent my career at the intersection of infrastructure, programmable networks, cloud, and AI. At Tata Communications, I led architecture and data-driven initiatives for global network services.

Senior Platform Engineer with 10+ years of experience in cloud and network architecture, 7+ years specializing in integration, automation, configuration tooling, hypervisor lifecycle, VM runtime, and Kubernetes, and 4+ years specializing in AI/ML platforms and MLOps.

Recently, I built Bridgit Social, a real-time networking app that helps professionals connect in person based on context and goals. I'm also exploring agentic AI and RAG systems for recommendations and coaching.

My technical background spans AI/ML platforms, cloud architecture (AWS, GCP), network automation, and data engineering. I love connecting systems end-to-end, from low-level infrastructure to high-level AI products.

Projects

Featured work

Bridgit Social

Designed and deployed a real-time user-matching system using few-shot prompting and retrieval-augmented generation (RAG) to connect users by context, intent, and proximity. Reduced irrelevant matches and boosted engagement.

iOS • Swift • Firebase • GCP • APIs • OpenAI • Postgres • pgvector • LangChain

AI-Driven Network Diagnostics

Retrieval-augmented assistant on top of network logs, runbooks, and telemetry to accelerate incident triage. Reduced investigation time by surfacing likely root causes in natural language.

RAG • LLMs • MLOps • Python • Observability

LLM Research & Experimentation

Hands-on exploration of Large Language Models including fine-tuning, prompt engineering, and evaluation techniques. Collection of Jupyter notebooks demonstrating practical LLM applications and best practices.

LLMs • Python • Jupyter • NLP • AI Research • Prompt Engineering

ONAP Network Orchestration

Contributed to Open Network Automation Platform (ONAP) for NFV orchestration and service automation. Led implementation projects following MEF 3.0 standards, achieving MEF certification and industry recognition.

ONAP • NFV • SDN • MEF • Network Automation • OpenStack
Experience

Professional Experience

Founder & CTO | Full-Stack Software Developer

Bridgit Group LLC

  • Designed and developed an AI-powered, location-based social networking platform, Bridgit Social, from concept to MVP in under 6 months, integrating real-time recommendations with sub-second response times and reducing costs by 70%.
  • Directed product management, including market research, user interviews, design, prototyping, and agile development, securing a beta user base of 100+ and positioning the platform for pre-seed investment.
  • Engineered scalable, cloud-native backend infrastructure and optimized geospatial queries, enabling real-time matchmaking for 1K+ concurrent requests and preparing for horizontal scaling to support 100K+ users.

Associate Director | Data Scientist & Solutions Architect

Tata Communications

  • Built a GenAI-powered support automation platform using LLMs, RAG, and MLOps workflows, reducing operational cost by 70% and improving ticket resolution time by 90%.
  • Implemented CI/CD and automated pipelines for predictive fault detection and closed-loop remediation, minimizing downtime and enabling proactive operations.
  • Enabled IZO Multi-Cloud Connect across 6 regions and 30+ data centers in partnership with Red Hat, building on Red Hat OpenStack Platform (RHOSP) to deliver a high-availability interconnect with sub-2 ms latency and 100% uptime as a scalable foundation for enterprise AI/ML workloads.
  • Engineered a high-availability Kubernetes cluster on OpenStack for an Open Network Automation Platform (ONAP)-based orchestration platform, with Ceph storage, Velero backups, and the OVN-Kubernetes CNI.
  • Integrated and optimized RHOSP networking with SR-IOV, Calico, and load balancers for the low-latency workloads behind IZO Multi-Cloud Connect, supporting high-speed network interfaces in edge vRouters.
  • Architected a Kubernetes-based services platform automating on-demand L2/L3 services, cutting provisioning time, achieving 10× operational efficiency gains, and enabling automation for 1,000+ network services globally.
  • Automated ONAP deployment on AWS EKS with Terraform, reducing Kubernetes deployment time by 50% and enabling consistent Kubernetes environments, resource optimization, automated testing, and scalable infrastructure.
  • Established CI/CD pipelines and automated workflows for predictive fault detection, closed-loop remediation, and system optimization, minimizing downtime and enabling proactive network maintenance.

Senior Cloud & Network Architect

Tata Communications

  • Engineered a neural network (LSTM) model to forecast quarterly core network bandwidth needs with 92% training and 90% test accuracy, saving $500K annually through proactive capacity planning.
  • Developed a Python-based automation SDK for network testing that reduced test case development time by 80% and standardized testing across 5+ engineering teams.
  • Optimized ONAP NaaS platform deployments to meet strict scalability, redundancy, and high-availability requirements, enabling 1-hour failover capabilities and avoiding potential SLA penalties exceeding $1M.
  • Accelerated delivery of network automation features by cutting QA turnaround from 2 weeks to 2 days, contributing to a 20% faster time-to-market.
  • Led a cross-functional platform support team providing timely technical assistance and production troubleshooting to 5+ engineering, service delivery, and operations teams.
  • Maintained and improved VM runtime workflows.
  • Integrated and configured ONAP for Day-2 management on Kubernetes (cluster logging and monitoring), using ELK for centralized logging and Prometheus, Grafana, and etcd diagnostics for SRE observability and faster issue resolution.
  • Deployed ONAP over Kubernetes on bare metal with BMC automation and Kickstart for RHEL provisioning.

Cloud & Network Engineer

Tata Communications

  • Implemented an OpenStack-based private cloud, increasing compute utilization by 70% and saving $250K annually.
  • Deployed Kubernetes/Docker orchestration and VM consolidation, enabling scalable test and ML research environments.
  • Researched Ethernet backbone traffic optimization using Provider Backbone Bridging (PBB), delivering recommendations that improved throughput efficiency by 15%.
  • Pioneered system engineering and deployment for a hypervisor-based VM platform on OpenStack, optimizing power and computing utilization by 70% and reducing costs by $250K per year, while enabling scalable infrastructure to support AI/ML testing workloads.
  • Represented the company at MEF, the Cloud Native Computing Foundation (CNCF), and the Linux Foundation, presenting results from 4+ POCs, influencing vendor and partner strategies, and positioning the company as a leader in open-source cloud and network automation innovation.
Education

Ph.D., Electrical Engineering

New Jersey Institute of Technology

M.S. and B.S., Electrical Engineering

Isfahan University of Technology

Skills

Technical expertise

From low-level network infrastructure to high-level ML products, I enjoy connecting systems end to end.

GenAI & LLM Systems

RAG • Prompt Engineering • LLM Fine-tuning • Transformers (Hugging Face) • Vector Databases (Pinecone, FAISS) • LangChain • LangGraph • Semantic Search • Real-Time Inference

Machine Learning & MLOps

PyTorch • TensorFlow • Keras • Scikit-learn • Pandas • MLflow • Kubeflow • SageMaker • CI/CD for ML • Distributed ML Pipelines • Neural Networks • NLP

Cloud & Distributed Systems

AWS (Infra services, Lambda, S3, Kinesis, SageMaker) • GCP (Firebase, Firestore, AI/ML) • Kubernetes & Docker • Terraform • Ansible • OpenStack (RHOSP) • VMware (NSX, ESXi, vCenter) • RHEL • Helm • Distributed System Design • High Availability Architectures

Data Engineering & Analytics

Spark (PySpark) • Hadoop • SQL • NoSQL • Vector Databases • Kafka • Stream Processing • Pandas • ETL Pipelines • Data Warehousing • Python SDKs • Analytics Pipelines • Predictive Modeling

Programming, Automation & Integration

Python • Java • SwiftUI • SQL • LangChain • LangGraph • CrewAI • Git • GitHub • GitLab • SDN, NFV, ONAP, OpenStack • Testing & QA Automation • n8n & Zapier • Bare metal (PXE, MAAS) • BMC

Networking

VXLAN • EVPN • Multi-Cloud Connect • MPLS • SD-WAN • L2/L3 Orchestration • CNI plugins (OVN-K, Calico) • Policy-Driven Networking

Monitoring & Observability

ELK stack • Grafana • Prometheus • Consul • etcd troubleshooting

Publications

Research & Publications

Selected publications in information theory, wireless communications, and energy transfer.

Constrained codes for joint energy and information transfer

A Fouladgar, O Simeone, E Erkip

IEEE Transactions on Communications

2014
66 citations

Interactive Joint Transfer of Energy and Information

A Fouladgar, O Simeone

IEEE Transactions on Communications

2013
152 citations

On the Transfer of Information and Energy in Multi-User Systems

A Fouladgar, O Simeone

IEEE Communications Letters

2012
212 citations
Contact

Let's work together

Open to new opportunities, collaborations, and conversations. Reach out through any of the channels below.