Researcher Collab

About

Vincent Koc specializes in AI, language technologies, and cross-cultural research, with 20+ years spanning academia and industry globally (MIT, UT Austin, Microsoft). Founder of Hyperthink Labs, he leads projects at the intersection of AI and social good, advocating responsible, inclusive technology informed by his multilingual, multinational background and an ethics-driven approach to artificial intelligence. Vincent is currently focused on LLM evaluation, multimodality, and agentic optimization.

Dual-Stream Contrastive Latent Learning Generative Adversarial Network for Brain Image Synthesis and Tumor Classification

Journal of Imaging

Generative adversarial networks (GANs) prioritize pixel-level attributes over capturing the entire image distribution, which is critical in image synthesis. To address this challenge, we propose a dual-stream contrastive latent projection generative adversarial network (DSCLPGAN) for the robust augmentation of MRI images. The dual-stream generator in our architecture incorporates two specialized processing pathways: one is dedicated to local feature variation modeling, while the other captures global structural transformations, ensuring a more comprehensive synthesis of medical images. We use a transformer-based encoder–decoder framework for contextual coherence, and a contrastive learning projection (CLP) module integrates contrastive loss into the latent space to generate diverse image samples. The generated images undergo adversarial refinement using an ensemble of specialized discriminators, where discriminator 1 (D1) ensures classification consistency with real MRI images, discriminator 2 (D2) produces a probability map of localized variations, and discriminator 3 (D3) preserves structural consistency. For validation, we utilized a publicly available MRI dataset which contains 3064 T1-weighted contrast-enhanced images with three types of brain tumors: meningioma (708 slices), glioma (1426 slices), and pituitary tumor (930 slices). The experimental results demonstrate state-of-the-art performance, achieving an SSIM of 0.99, classification accuracy of 99.4% for an augmentation diversity level of 5, and a PSNR of 34.6 dB. Our approach has the potential to generate high-fidelity augmentations for reliable AI-driven clinical decision support systems.

Authors: Junaid Zafar, Vincent Koc, Haroon Zafar
Publish Year: 2025
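
The contrastive latent projection described in the abstract above hinges on a contrastive loss over latent embeddings. As a rough illustration only (not the paper's CLP module; the InfoNCE formulation, batch size, and temperature below are assumptions), a minimal NumPy sketch:

```python
import numpy as np

def info_nce_loss(z_a, z_b, temperature=0.1):
    """Contrastive (InfoNCE) loss between two batches of latent
    projections: row i of z_a and z_b form a positive pair, all other
    rows act as negatives. Returns the mean loss over the batch."""
    # L2-normalize so dot products become cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature           # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    # Log-softmax over each row; the diagonal holds the positive pairs.
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
# Identical views are a perfect match: loss is near its minimum.
aligned = info_nce_loss(z, z)
# Unrelated views score higher (worse).
shuffled = info_nce_loss(z, rng.normal(size=(8, 16)))
print(aligned < shuffled)  # → True
```

Pulling the positive pairs together and pushing negatives apart in this way is what encourages diverse yet structured latent samples for synthesis.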
Predictive and Reinforcement Learning-Based Framework for Cloud Resource Optimization

IEEE

The article presents a new optimization framework for dynamically managing resources in a cloud environment that combines predictive analytics and reinforcement learning (RL) to optimally manage operational cost, system performance, and compliance with SLAs in multi-tenant cloud settings. Our framework comprises a predictive tier featuring resource demand forecasting based on regression models and an optimization tier that employs real-time resource allocation using Deep Q-Networks (DQN). Experiments conducted with real-world data from Microsoft Azure and MIT’s Supercloud have shown that the framework reduces over-provisioning and SLA violations while improving cost-efficiency. The automated dynamic resource allocation strategy proposed in this study outperforms traditional static allocation methods by reducing resource wastage and enhancing SLA compliance, demonstrating the viability of the approach in sophisticated multi-tenant cloud environments. This blended approach improves the efficiency of resource utilization while maintaining flexible and economical resource control. The findings accelerate the integration of AI-powered models, such as predictive analytics and reinforcement learning, with cloud resource management in response to the evolving challenges of cloud infrastructure complexity. We show that the proposed framework addresses the problems of achieving high-performance, scalable, and cost-effective optimization of cloud resources.

Authors: Vincent Koc, Vamsidhar Reddy Kamanuru
Publish Year: 2025
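
The optimization tier's decision loop can be illustrated with a toy example. The paper uses Deep Q-Networks; the tabular Q-learning sketch below is a simplification, and its states, actions, and cost/SLA reward are illustrative stand-ins rather than the paper's setup:

```python
import random

# Tabular Q-learning sketch of the allocation loop (the paper uses a
# DQN; states, actions, and the reward here are illustrative).
STATES = ["low", "medium", "high"]          # forecast demand level
ACTIONS = [1, 2, 4]                          # instances to allocate
DEMAND = {"low": 1, "medium": 2, "high": 4}  # instances actually needed

def reward(state, action):
    # Penalize cost per instance, and penalize an SLA violation heavily
    # whenever allocation falls short of demand.
    sla_penalty = 10 if action < DEMAND[state] else 0
    return -(action + sla_penalty)

def train(episodes=2000, alpha=0.2, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice(STATES)
        if rng.random() < epsilon:                       # explore
            a = rng.choice(ACTIONS)
        else:                                            # exploit
            a = max(ACTIONS, key=lambda act: q[(s, act)])
        # Single-step episodes: the target is just the immediate reward.
        q[(s, a)] += alpha * (reward(s, a) - q[(s, a)])
    return q

q = train()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}
print(policy)  # converges to matching allocation to demand
```

The learned policy allocates exactly what each demand level needs, mirroring the abstract's point that a learned dynamic policy avoids both over-provisioning and SLA violations.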
Generative AI and Large Language Models in Language Preservation: Opportunities and Challenges

arXiv (Cornell University)

The global crisis of language endangerment meets a technological turning point as Generative AI (GenAI) and Large Language Models (LLMs) unlock new frontiers in automating corpus creation, transcription, translation, and tutoring. However, this promise is imperiled by fragmented practices and the critical lack of a methodology to navigate the fraught balance between LLM capabilities and the profound risks of data scarcity, cultural misappropriation, and ethical missteps. This paper introduces a novel analytical framework that systematically evaluates GenAI applications against language-specific needs, embedding community governance and ethical safeguards as foundational pillars. We demonstrate its efficacy through the Te Reo Māori revitalization, where it illuminates successes, such as community-led Automatic Speech Recognition achieving 92% accuracy, while critically surfacing persistent challenges in data sovereignty and model bias for digital archives and educational tools. Our findings underscore that GenAI can indeed revolutionize language preservation, but only when interventions are rigorously anchored in community-centric data stewardship, continuous evaluation, and transparent risk management. Ultimately, this framework provides an indispensable toolkit for researchers, language communities, and policymakers, aiming to catalyze the ethical and high-impact deployment of LLMs to safeguard the world's linguistic heritage.

Authors: Vincent Koc
Publish Year: 2025
Dual-Stream Contrastive Latent Learning GAN for Brain Image Synthesis and Tumor Classification

Preprints.org

Generative Adversarial Networks (GANs) prioritize pixel-level attributes over capturing the entire image distribution, which is critical in image synthesis. To address this challenge, we propose DSCLPGAN, a dual-stream generator coupled with contrastive latent projection (CLP) for the robust augmentation of MRI images. The dual-stream generator in our architecture incorporates two specialized processing pathways: one dedicated to local feature variation modeling, while the other captures global structural transformations, ensuring a more comprehensive synthesis of medical images. We use a transformer-based encoder–decoder framework for contextual coherence, and the contrastive learning projection (CLP) module integrates contrastive loss into the latent space for generating diverse image samples. The generated images undergo adversarial refinement using an ensemble of specialized discriminators, where discriminator 1 (D1) ensures classification consistency with real MRI images, discriminator 2 (D2) produces a probability map of localized variations, and discriminator 3 (D3) preserves structural consistency. For validation, we utilize a publicly available MRI dataset which contains 3064 T1-weighted contrast-enhanced images with three types of brain tumors: meningioma (708 slices), glioma (1426 slices), and pituitary tumor (930 slices). Experimental results demonstrate state-of-the-art performance, achieving an SSIM of 0.99, a classification accuracy of 99.4% at an augmentation diversity level of 5, and a PSNR of 34.6 dB. Our approach has the potential to generate high-fidelity augmentations for reliable AI-driven clinical decision support systems.

Authors: Junaid Zafar, Vincent Koc, Haroon Zafar
Publish Year: 2025
Mind the Metrics: Patterns for Telemetry-Aware In-IDE AI Application Development using Model Context Protocol (MCP)

Modern AI-driven development environments are destined to evolve into observability-first platforms by integrating real-time telemetry and feedback loops directly into the developer workflow. This paper introduces telemetry-aware IDEs driven by Model Context Protocol (MCP), a new paradigm for building software. We articulate how an IDE (integrated development environment), enhanced with an MCP client/server, can unify prompt engineering with live metrics, traces, and evaluations to enable iterative optimization and robust monitoring. We present a progression of design patterns: from local large language model (LLM) coding with immediate metrics feedback, to continuous integration (CI) pipelines that automatically refine prompts, to autonomous agents that monitor and adapt prompts based on telemetry. Instead of focusing on any single optimizer, we emphasize a general architecture (exemplified by the Model Context Protocol and illustrated through Comet's Opik MCP server implementation) that consolidates prompt and agent telemetry for the future integration of various optimization techniques. We survey related work in prompt engineering, AI observability, and optimization (e.g., Prompts-as-Programs, DSPy's MIPRO, Microsoft's PromptWizard) to position this approach within the emerging AI developer experience. This theoretical systems perspective highlights new design affordances and workflows for AI-first software development, laying a foundation for future benchmarking and empirical studies on optimization in these environments.

Authors: Vincent Koc, Jacques Verre, Douglas A. Blank, Abigail Morgan
Publish Year: 2025
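
The telemetry feedback pattern described above (log metrics for each prompt variant, then surface the best performer to the developer) can be sketched minimally. The class and method names here are hypothetical, not the Opik or MCP API:

```python
import statistics
from collections import defaultdict

class PromptTelemetry:
    """Toy telemetry store for the prompt-iteration loop: record an
    evaluation score per prompt variant, then report the variant with
    the best mean score so far."""

    def __init__(self):
        self._scores = defaultdict(list)

    def record(self, variant: str, score: float) -> None:
        # One evaluation score, e.g. from an automated LLM judge.
        self._scores[variant].append(score)

    def best_variant(self) -> str:
        # Highest mean score wins; an IDE would surface this inline.
        return max(self._scores, key=lambda v: statistics.mean(self._scores[v]))

telemetry = PromptTelemetry()
for score in (0.62, 0.58, 0.65):
    telemetry.record("v1-terse", score)
for score in (0.81, 0.77, 0.84):
    telemetry.record("v2-step-by-step", score)

print(telemetry.best_variant())  # → v2-step-by-step
```

A CI pipeline or autonomous agent, as in the later design patterns, would call the same record/select loop automatically rather than interactively.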
Mind the Metrics: Patterns for Telemetry-Aware In-IDE AI Application Development using the Model Context Protocol (MCP)

arXiv (Cornell University)

AI development environments are evolving into observability-first platforms that integrate real-time telemetry, prompt traces, and evaluation feedback into the developer workflow. This paper introduces telemetry-aware integrated development environments (IDEs) enabled by the Model Context Protocol (MCP), a system that connects IDEs with prompt metrics, trace logs, and versioned control for real-time refinement. We present design patterns for local prompt iteration, CI-based optimization, and autonomous agents that adapt behavior using telemetry. Rather than focusing on a single algorithm, we describe an architecture that supports integration with frameworks like DSPy, PromptWizard, and Prompts-as-Programs. We demonstrate this through Opik, an open-source MCP server for LLM telemetry, and position our approach within the emerging LLMOps ecosystem. This work lays a foundation for future research on prompt optimization, IDE agent tooling, and empirical benchmarking in telemetry-rich AI development workflows.

Authors: Vincent Koc, Jacques Verre, Douglas A. Blank, Abigail Morgan
Publish Year: 2025
SmartCert: A Multi-modal framework for automated guided vehicle screening

Pervasive and Mobile Computing
Authors: Xu Chen, Sandeep Kanta, Vincent Koc, Santhi Bharath Punati, Arif Hussain, Sunny Katyara
Publish Year: 2025
Tiny QA Benchmark++: Ultra-Lightweight, Synthetic Multilingual Dataset Generation & Smoke-Tests for Continuous LLM Evaluation

arXiv (Cornell University)

Tiny QA Benchmark++ (TQB++) presents an ultra-lightweight, multilingual smoke-test suite designed to give large-language-model (LLM) pipelines a unit-test-style safety-net dataset that runs in seconds with minimal cost. It was born out of the tight feedback-loop demands of building the Comet Opik prompt-optimization SDK, where waiting on heavyweight benchmarks breaks developer flow. TQB++ couples a 52-item English gold set (less than 20 kB) with a tiny synthetic-data generator PyPI package built on provider-agnostic LiteLLM. The generator lets practitioners mint their own tiny packs in any language, domain, or difficulty, while ten ready-made packs already cover Arabic, Chinese, French, German, Japanese, Korean, Portuguese, Russian, Spanish, and Turkish. Every dataset ships with Croissant metadata and plug-and-play files for OpenAI-Evals, LangChain, and standard CI tools, so teams can drop deterministic micro-benchmarks directly into pull-request gates, prompt-engineering loops, and production dashboards without touching GPU budgets. A complete TQB++ run adds only a few seconds to pipeline latency yet reliably flags prompt-template errors, tokenizer drift, and fine-tuning side effects long before full-scale suites like MMLU or BIG-Bench would finish configuring. The entire framework is released to accelerate continuous, resource-efficient quality assurance across the generative-AI ecosystem.

Authors: Vincent Koc
Publish Year: 2025
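
The "unit-test style safety net" idea above amounts to a CI gate: run a tiny gold QA set against the model and fail fast on regressions. The gold items and `model` stub below are illustrative stand-ins, not the actual TQB++ dataset or tooling:

```python
# Sketch of a deterministic smoke-test gate over a tiny gold QA set.
# The items and the model stub are hypothetical examples.
GOLD = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "What is the capital of France?", "answer": "Paris"},
]

def model(question: str) -> str:
    # Stand-in for a real LLM call; a broken prompt template or
    # tokenizer drift would surface here as wrong answers.
    canned = {"What is 2 + 2?": "4", "What is the capital of France?": "Paris"}
    return canned.get(question, "")

def smoke_test(gold, predict, threshold=1.0):
    """Exact-match accuracy over the gold set; the gate fails when
    accuracy drops below the threshold."""
    correct = sum(predict(item["question"]).strip() == item["answer"]
                  for item in gold)
    accuracy = correct / len(gold)
    return accuracy >= threshold, accuracy

passed, accuracy = smoke_test(GOLD, model)
print(passed, accuracy)  # → True 1.0
```

Because the check is exact-match and the set is tiny, the gate is deterministic and adds only seconds, which is what makes it viable inside pull-request pipelines.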
wikipedia-2017-bm25

Hugging Face
Authors: Comet ML, Vincent Koc
Publish Year: 2025
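
The dataset's name references the Okapi BM25 ranking function. For illustration, a minimal self-contained BM25 scorer; this is the standard formula with the usual default parameters (k1 = 1.5, b = 0.75), and the example documents are not drawn from the dataset:

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each document for a whitespace-tokenized
    query, using the standard smoothed-IDF formulation."""
    tokenized = [d.lower().split() for d in docs]
    N = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / N
    # Document frequency for each query term.
    df = {t: sum(t in d for d in tokenized) for t in query.lower().split()}
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term, n_t in df.items():
            if n_t == 0:
                continue
            idf = math.log((N - n_t + 0.5) / (n_t + 0.5) + 1)
            freq = tf[term]
            # Term frequency saturation (k1) and length normalization (b).
            score += idf * freq * (k1 + 1) / (
                freq + k1 * (1 - b + b * len(doc) / avgdl))
        scores.append(score)
    return scores

docs = [
    "the quick brown fox",
    "okapi bm25 ranking of documents by term weighting",
    "wikipedia articles about ranking",
]
scores = bm25_scores("bm25 ranking", docs)
print(max(range(len(docs)), key=scores.__getitem__))  # → 1
```

The document matching both query terms wins despite being longer, showing how the IDF weighting and length normalization trade off in the score.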
Vincent Koc (ORCID verified), Lecturer in Computer Science: Artificial Intelligence
Massachusetts Institute of Technology (MIT)
LLM evaluation, prompt optimization, applied-AI in LLMs

My research blends industry application with academia and looks at a variety of areas spanning production AI systems focused on generative …

United States
No collaborations yet.