FAANGineering - BlogFlock
2025-05-09T14:21:59.950Z | BlogFlock
Sources: Google Developers Blog, The GitHub Blog, Nextdoor Engineering - Medium, Engineering at Meta, Netflix TechBlog - Medium, Etsy Engineering | Code as Craft

Advancing the frontier of video understanding with Gemini 2.5 - Google Developers Blog
https://developers.googleblog.com/en/gemini-2-5-video-understanding/ (2025-05-09T13:20:55.000Z)
Gemini 2.5 marks a major leap in video understanding: it achieves state-of-the-art performance on key video understanding benchmarks and can seamlessly combine audio-visual information with code and other data formats.

Gemini 2.5 Models now support implicit caching - Google Developers Blog
https://developers.googleblog.com/en/gemini-2-5-models-now-support-implicit-caching/ (2025-05-08T17:50:55.000Z)
The rollout of implicit caching in the Gemini API expands on the existing explicit caching API, providing an "always on" caching system that offers automatic cost savings to developers using Gemini 2.5 models, while the explicit caching API remains available for guaranteed savings.

Accelerating GPU indexes in Faiss with NVIDIA cuVS - Engineering at Meta
https://engineering.fb.com/?p=22423 (2025-05-08T17:00:22.000Z)
<ul>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Meta and NVIDIA collaborated to accelerate vector search on GPUs by integrating</span> <a href="https://github.com/rapidsai/cuvs" target="_blank" rel="noopener"><span style="font-weight: 400;">NVIDIA cuVS</span></a><span style="font-weight: 400;"> into</span><a href="https://github.com/facebookresearch/faiss/releases/tag/v1.10.0" target="_blank" rel="noopener"> <span style="font-weight: 400;">Faiss v1.10</span></a><span style="font-weight: 400;">, Meta’s open source library for similarity search.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">This new implementation of cuVS will be more performant than classic GPU-accelerated search in some areas.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">For inverted file (IVF) indexing, NVIDIA cuVS outperforms classical GPU-accelerated IVF build times by up to 4.7x; and search latency is reduced by as much as 8.1x.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">For graph indexing, CUDA ANN Graph (CAGRA) outperforms CPU Hierarchical Navigable Small World graphs (HNSW) build times by up to 12.3x; and search latency is reduced by as much as 4.7x.</span></li>
</ul>
<h1>The Faiss library</h1>
<p><a href="https://engineering.fb.com/2017/03/29/data-infrastructure/faiss-a-library-for-efficient-similarity-search/" target="_blank" rel="noopener">Faiss</a> is an open source library, developed by Meta FAIR, for efficient vector search and clustering of dense vectors. Faiss pioneered vector search on GPUs, as well as the ability to seamlessly switch between GPUs and CPUs. It has made a lasting impact in both research and industry: it is integrated into several databases (e.g., Milvus and OpenSearch), machine learning libraries, data processing libraries, and AI workflows. Faiss is also used heavily by researchers and data scientists as a standalone library, often <a href="https://github.com/facebookresearch/faiss/pull/1484" target="_blank" rel="noopener">paired with PyTorch</a>.</p>
<h1>Collaboration with NVIDIA</h1>
<p>Three years ago, Meta and NVIDIA began working together to enhance vector search technology and accelerate it on GPUs. Earlier, in 2016, Meta had incorporated high-performing vector search algorithms built for NVIDIA GPUs: <code>GpuIndexFlat</code>, <code>GpuIndexIVFFlat</code>, and <code>GpuIndexIVFPQ</code>. After the partnership, NVIDIA rapidly contributed <a href="https://arxiv.org/abs/2308.15136" target="_blank" rel="noopener">GpuIndexCagra</a>, a state-of-the-art graph-based index designed specifically for GPUs. In its latest release, <a href="https://github.com/facebookresearch/faiss/releases/tag/v1.10.0" target="_blank" rel="noopener">Faiss 1.10.0</a> officially includes these algorithms from the <a href="https://github.com/rapidsai/cuvs" target="_blank" rel="noopener">NVIDIA cuVS library</a>.</p>
<p>Faiss 1.10.0 also includes a <a href="https://anaconda.org/pytorch/faiss-gpu-cuvs" target="_blank" rel="noopener">new conda package</a> that unlocks the ability to choose between the classic Faiss GPU implementations and the newer <a href="https://github.com/facebookresearch/faiss/wiki/GPU-Faiss-with-cuVS-usage" target="_blank" rel="noopener">NVIDIA cuVS algorithms</a>, making it easy for users to switch between GPU and CPU.</p>
<h1>Benchmarking</h1>
<p>The following benchmarks were conducted using the <a href="https://docs.rapids.ai/api/cuvs/nightly/cuvs_bench/" target="_blank" rel="noopener">cuVS-bench</a> tool.</p>
<p>We benchmarked two datasets:</p>
<ul>
<li>A tall, slender image dataset: a subset of 100 million vectors, each with 96 dimensions, from the <a href="https://research.yandex.com/blog/benchmarks-for-billion-scale-similarity-search" target="_blank" rel="noopener">Deep1B</a> dataset.</li>
<li>A short, wide dataset of text embeddings: <a href="https://github.com/zilliztech/VectorDBBench?tab=readme-ov-file#benchmark-cases" target="_blank" rel="noopener">5 million vector embeddings</a>, curated using the <a href="https://openai.com/index/new-and-improved-embedding-model/" target="_blank" rel="noopener">OpenAI text-embedding-ada-002 model</a>.</li>
</ul>
<p>Tests for index build times and search latency were conducted on an <a href="https://www.nvidia.com/en-us/data-center/h100/" target="_blank" rel="noopener">NVIDIA H100 GPU</a> and compared to an Intel Xeon Platinum 8480CL system. Results are reported in the tables below at 95% recall along the <a href="https://docs.rapids.ai/api/cuvs/nightly/comparing_indexes/" target="_blank" rel="noopener">Pareto frontiers</a> for k=10 nearest neighbors.</p>
<h2>Build time (95% recall@10)</h2>
<table border="1">
<thead>
<tr>
<th colspan="2">Index</th>
<th colspan="2">Embeddings 100M x 96 (seconds)</th>
<th colspan="2">Embeddings 5M x 1536 (seconds)</th>
</tr>
<tr>
<th>Faiss Classic</th>
<th>Faiss cuVS</th>
<th>Faiss Classic</th>
<th>Faiss cuVS</th>
<th>Faiss Classic</th>
<th>Faiss cuVS</th>
</tr>
</thead>
<tbody>
<tr>
<td>IVF Flat</td>
<td>IVF Flat</td>
<td>101.4</td>
<td><b>37.9</b> (2.7x)</td>
<td>24.4</td>
<td><b>15.2</b> (1.6x)</td>
</tr>
<tr>
<td>IVF PQ</td>
<td>IVF PQ</td>
<td>168.2</td>
<td><b>72.7</b> (2.3x)</td>
<td>42.0</td>
<td><b>9.0</b> (4.7x)</td>
</tr>
<tr>
<td>HNSW (CPU)</td>
<td>CAGRA</td>
<td>3322.1</td>
<td><b>518.5</b> (6.4x)</td>
<td>1106.1</td>
<td><b>89.7</b> (12.3x)</td>
</tr>
</tbody>
</table>
<p><i>Table 1: Index build times for Faiss-classic and Faiss-cuVS in seconds (with NVIDIA cuVS speedups in parentheses).</i></p>
<h2>Search latency (95% recall@10)</h2>
<table border="1">
<thead>
<tr>
<th colspan="2">Index</th>
<th colspan="2">Embeddings 100M x 96 (milliseconds)</th>
<th colspan="2">Embeddings 5M x 1536 (milliseconds)</th>
</tr>
<tr>
<th>Faiss Classic</th>
<th>Faiss cuVS</th>
<th>Faiss Classic</th>
<th>Faiss cuVS</th>
<th>Faiss Classic</th>
<th>Faiss cuVS</th>
</tr>
</thead>
<tbody>
<tr>
<td>IVF Flat</td>
<td>IVF Flat</td>
<td>0.75</td>
<td><b>0.39</b> (1.9x)</td>
<td>1.98</td>
<td><b>1.14</b> (1.7x)</td>
</tr>
<tr>
<td>IVF PQ</td>
<td>IVF PQ</td>
<td>0.49</td>
<td><b>0.17</b> (2.9x)</td>
<td>1.78</td>
<td><b>0.22</b> (8.1x)</td>
</tr>
<tr>
<td>HNSW (CPU)</td>
<td>CAGRA</td>
<td>0.56</td>
<td><b>0.23</b> (2.4x)</td>
<td>0.71</td>
<td><b>0.15</b> (4.7x)</td>
</tr>
</tbody>
</table>
<p><i>Table 2: Online (i.e., one at a time) search query latency for Faiss-classic and Faiss-cuVS in milliseconds (with NVIDIA cuVS speedups in parentheses).</i></p>
<h2>Looking forward</h2>
<p>The emergence of state-of-the-art NVIDIA GPUs has revolutionized the field of vector search, enabling high recall and lightning-fast search speeds. The integration of Faiss and cuVS will continue to incorporate state-of-the-art algorithms, and we look forward to unlocking new innovations in this partnership between Meta and NVIDIA.</p>
<p>Read here for <a href="https://developer.nvidia.com/cuvs" target="_blank" rel="noopener">more details about NVIDIA cuVS</a>.</p>
<p>The post <a rel="nofollow" href="https://engineering.fb.com/2025/05/08/data-infrastructure/accelerating-gpu-indexes-in-faiss-with-nvidia-cuvs/">Accelerating GPU indexes in Faiss with NVIDIA cuVS</a> appeared first on <a rel="nofollow" href="https://engineering.fb.com">Engineering at Meta</a>.</p>
Measuring Dialogue Intelligibility for Netflix Content - Netflix TechBlog - Medium
https://medium.com/p/58c13d2a6f6e (2025-05-08T00:40:17.000Z)
<p><em>Enhancing Member Experience Through Strategic Collaboration</em></p><p><a href="https://www.linkedin.com/in/ozziesutherland/">Ozzie Sutherland</a>, <a href="https://www.linkedin.com/in/iroroorife/">Iroro Orife</a>, <a href="https://www.linkedin.com/in/chih-wei-wu-73081689/">Chih-Wei Wu</a>, <a href="https://www.linkedin.com/in/bhanusrikanth/">Bhanu Srikanth</a></p><p>At Netflix, delivering the best possible experience for our members is at the heart of everything we do, and we know we can’t do it alone. That’s why we work closely with a diverse ecosystem of technology partners, combining their deep expertise with our creative and operational insights. Together, we explore new ideas, develop practical tools, and push technical boundaries in service of storytelling. This collaboration not only empowers the talented creatives working on our shows with better tools to bring their vision to life, but also helps us innovate in service of our members. By building these partnerships on trust, transparency, and shared purpose, we’re able to move faster and more meaningfully, always with the goal of making our stories more immersive, accessible, and enjoyable for audiences everywhere. One area where this collaboration is making a meaningful impact is in improving dialogue intelligibility, from set to screen. We call this the Dialogue Integrity Pipeline.</p><h4>Dialogue Integrity Pipeline</h4><p>We’ve all been there, settling in for a night of entertainment, only to find ourselves straining to catch what was just said on screen. You’re wrapped up in the story, totally invested, when suddenly a key line of dialogue vanishes into thin air. “Wait, what did they say? I can’t understand the dialogue! What just happened?”</p><p>You may pick up the remote and rewind, turn up the volume, or try to stay with it and hope this doesn’t happen again. Creating sophisticated, modern series and films requires an incredible artistic and technical effort. At Netflix, we strive to ensure those great stories are easy for the audience to enjoy. Dialogue intelligibility can break down at multiple points in what we call the <strong>Dialogue Integrity Pipeline</strong>, the journey from on-set capture to final playback at home. Many facets of the process can contribute to dialogue that’s difficult to understand:</p><ul><li>Naturalistic acting styles, diverse speech patterns, and accents</li><li>Noisy locations, microphone placement problems on set</li><li>Cinematic (high dynamic range) mixing styles, excessive dialogue processing, substandard equipment</li><li>Audio compromises through the distribution pipeline</li><li>TVs with inadequate speakers, noisy home environments</li></ul><p>Addressing these issues is critical to maintaining the standard of excellence our content deserves.</p><h4>Measurement at Scale</h4><p>Netflix utilizes industry-standard loudness meters to measure content and its adherence to our core loudness specifications. This tool also provides feedback on audio dynamic range (loud to soft), which impacts dialogue intelligibility.
The Audio Algorithms team at Netflix wanted to take these measurements further and develop a holistic understanding of dialogue intelligibility throughout the runtime of a given title.</p><p>The team developed a speech intelligibility measurement system based on the Short-Time Objective Intelligibility (STOI) metric [<a href="https://www.researchgate.net/profile/Cees-Taal/publication/224219052_An_Algorithm_for_Intelligibility_Prediction_of_Time-Frequency_Weighted_Noisy_Speech/links/0deec51da9fbbc5eea000000/An-Algorithm-for-Intelligibility-Prediction-of-Time-Frequency-Weighted-Noisy-Speech.pdf">Taal et al.</a> (IEEE <em>Transactions on Audio, Speech, and Language Processing</em>)]. First, a speech activity detector analyzes the dialogue stem to extract speech utterances, which are then compared to the non-speech sounds in the mix, typically music and effects. The system then calculates the signal-to-noise ratio in each speech frequency band and summarizes the results per utterance on a [0, 1.0] scale, quantifying the degree to which competing music and effects can distract the listener.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*WSViFfuvT8pcZshi" /><figcaption>This chart shows how the eSTOI (extended Short-Time Objective Intelligibility) method measures dialogue (the fg [foreground] stem in the graphic) against non-speech (the bg [background] stem in the graphic) to judge intelligibility based on competing non-speech sound.</figcaption></figure><h4>Optimizing Dialogue Prior to Delivery</h4><p>Understanding dialogue intelligibility across Netflix titles is invaluable, but our mission goes beyond analysis — we strive to empower creators with the tools to craft mixes that resonate seamlessly with audiences at home.</p><p>Seeing the lack of dedicated dialogue intelligibility meter plugins for digital audio workstations, we teamed up with industry leaders Fraunhofer Institute for Digital Media Technology IDMT (Fraunhofer IDMT) and Nugen Audio to pioneer a solution that enhances creative control and ensures crystal-clear dialogue from mix to final delivery.</p><p>We collaborated with Fraunhofer IDMT to adapt their machine-learning-based speech intelligibility solution for cross-platform plugin standards and brought in Nugen Audio to develop DAW-compatible plugins.</p><h4>Fraunhofer IDMT</h4><p>The Fraunhofer Department of Hearing, Speech, and Audio Technology HSA has done significant research and development on media processing tools that measure speech intelligibility. In 2020, the machine-learning-based method was integrated into Steinberg’s Nuendo Digital Audio Workstation. We approached the Fraunhofer engineering team with a collaboration proposal to make their technology accessible to other audio workstations through the cross-platform VST (Virtual Studio Technology) and AAX (Avid Audio Extension) plugin standards. The scientists were keen on the project and provided their dialogue intelligibility library.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*wuapXe2lajcx3tTj" /><figcaption>The Fraunhofer IDMT Dialogue Intelligibility Meter integrated into the Steinberg Nuendo Digital Audio Workstation.</figcaption></figure><h4>Nugen Audio</h4><p>Nugen Audio created the VisLM plugin to provide sound teams with an efficient and accurate way to measure mixes for conformance to traditional broadcast and streaming specifications — Full Mix Loudness, Dialogue Loudness, and True Peak.
Since then, VisLM has become a widely used tool throughout the global post-production industry. Nugen Audio partnered with Fraunhofer, integrating the Fraunhofer IDMT Dialogue Intelligibility libraries into a new industry-first tool — Nugen DialogCheck. This tool gives <strong>re-recording mixers</strong> real-time insights, helping them adjust dialogue clarity at the most crucial points in the mixing process, ensuring every word is clear and understood.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*gGt-DpKR806J2jqT" /></figure><h4>Clearer Dialogue Through Collaboration</h4><p>Crafting crystal-clear dialogue isn’t just a technical challenge — it’s an art that requires continuous innovation and strong industry collaboration. To empower creators, Netflix and its partners are embedding advanced intelligibility measurement tools directly into DAWs, giving sound teams the ability to:</p><ul><li>Detect and resolve dialogue clarity issues early in the mix.</li><li>Fine-tune speech intelligibility without compromising artistic intent.</li><li>Deliver immersive, accessible storytelling to every viewer, in any listening environment.</li></ul><p>At Netflix, we’re committed to pushing the boundaries of audio excellence. From pioneering the eSTOI (extended short-term objective intelligibility) method to collaborating with Fraunhofer and Nugen Audio on cutting-edge tools like the DialogCheck plugin, we’re setting a new standard for dialogue clarity — ensuring every word is heard exactly as creators intended. But innovation doesn’t happen in isolation. By working together with our partners, we can continue to push the limits of what’s possible, fueling creativity and driving the future of storytelling.</p><p>Finally, we’d like to extend a heartfelt thanks to Scott Kramer for his contributions to this initiative.</p><hr><p><a href="https://netflixtechblog.com/measuring-dialogue-intelligibility-for-netflix-content-58c13d2a6f6e">Measuring Dialogue Intelligibility for Netflix Content</a> was originally published in <a href="https://netflixtechblog.com">Netflix TechBlog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>

Create and edit images with Gemini 2.0 in preview - Google Developers Blog
https://developers.googleblog.com/en/generate-images-gemini-2-0-flash-preview/ (2025-05-07T16:55:56.000Z)
Gemini 2.0 Flash’s image generation capabilities, now available in preview in Google AI Studio and Vertex AI, feature higher rate limits, enhanced visual quality, more precise text rendering, and more, allowing developers to create applications for product recontextualization, collaborative image editing, and dynamic SKU generation.

Gemini 2.5 Pro Preview: even better coding performance - Google Developers Blog
https://developers.googleblog.com/en/gemini-2-5-pro-io-improved-coding-performance/ (2025-05-06T16:00:55.000Z)
An updated I/O edition preview of Gemini 2.5 Pro is being released for developers, featuring best-in-class front-end and UI development performance, ranking #1 on the WebDev Arena leaderboard, and showcasing applications like video to code and easier feature development through starter apps.

Dos and don’ts when sunsetting open source projects - The GitHub Blog
https://github.blog/?p=87520 (2025-05-06T16:00:00.000Z)
"http://www.w3.org/TR/REC-html40/loose.dtd">
<p>Maintaining an open source project can be a big responsibility. But it’s not one you’re obligated to bear forever. Maybe usage has declined thanks to a better solution. Maybe technology has evolved to the point that it’s easier to start over with a new project than adapt an old project to a new ecosystem. Sometimes it’s time to move on, even if that means deprecating a project.</p>
<p><a href="https://github.com/ttscoff">Brett Terpstra</a>, a front-end developer, maintains more than 100 GitHub repositories and has had to retire more than a few. “Projects that rely on APIs and other outside applications often require more work than is worthwhile once things start to break,” he explained <a href="https://mailchi.mp/3d96930ae316/githubs-the-readme-project-fixing-the-technical-interview-909033?e=662aaf9c66">in a Q&A</a>. “Historically, those are the projects that get retired the fastest.”</p>
<p>Whatever your reasons, you want to sunset the project gracefully to protect your reputation and do right by your users. Here are some insights from maintainers who have navigated the process about what you should and shouldn’t do when it’s time to deprecate a project.</p>
<h2 class="wp-block-heading" id="h-don-t-keep-maintaining-something-for-too-long">Don’t: Keep maintaining something for too long</h2>
<p>The one thing <a href="https://github.com/olgabot">Olga Botvinnik</a>, a computational biologist, would tell her younger self is that she should have sunsetted her Python data visualization package <a href="https://github.com/olgabot/prettyplotlib">prettyplotlib</a> sooner. She didn’t want to abandon the project, but she had started it as part of her PhD work, felt like updating it to support Python 3 would be daunting, and was interested in moving on to other projects. Besides, another Python visualization library called <a href="https://github.com/mwaskom/seaborn">Seaborn</a> was becoming increasingly popular. </p>
<p class="purple-text text-gradient-purple-coral" style="margin-top:var(--wp--preset--spacing--40);margin-bottom:var(--wp--preset--spacing--40)">“Even if I’m immediately done working on a project, I leave the 30-day window open to take care of issues and help users transition.” – Brett Terpestra, front-end developer</p>
<p>Botvinnik thought Seaborn was better in some ways, and more polished. So she made the decision to deprecate prettyplotlib and spend her time contributing to Seaborn instead. “One of my mentors told me that knowing when to end a project is just as good as finishing it,” she says. “That made me feel a lot better about letting it go.”</p>
<h2 class="wp-block-heading" id="h-do-leave-the-door-open-for-someone-else">Do: Leave the door open for someone else</h2>
<p>That said, you shouldn’t deprecate a project without considering other options, like handing it to another maintainer. Terpstra has deprecated many projects, but he always looks for someone else to take them over first. “There are different degrees of sunsetting,” he says. In some cases, a project is so simple that it doesn’t need much maintenance. In that case, you can just make a note that you don’t often update the project while leaving the door open for new contributions.</p>
<p>Of course it’s not always appropriate to hand off a project to another maintainer. <a href="https://github.com/benbjohnson">Ben Johnson</a>, maintainer of the SQLite recovery tool Litestream, opted to retire <a href="https://github.com/boltdb/bolt">BoltDB</a> and point people towards a fork called <a href="https://github.com/etcd-io/bbolt">BBolt</a> rather than have someone take over the original. “My name and reputation were pretty closely tied to the project at the time,” says Johnson. “I was the BoltDB guy. I didn’t want to put my reputation in the hands of someone else.”</p>
<h2 class="wp-block-heading" id="h-don-t-pull-the-plug-without-notice">Don’t: Pull the plug without notice</h2>
<p>Terpstra gives at least a month’s notice before retiring a project. “Even if I’m immediately done working on a project, I leave the 30-day window open to take care of issues and help users transition,” he says.</p>
<p class="purple-text text-gradient-purple-coral" style="margin-top:var(--wp--preset--spacing--40);margin-bottom:var(--wp--preset--spacing--40)">“One of my mentors told me that knowing when to end a project is just as good as finishing it. That made me feel a lot better about letting it go.” – Olga Botvinnik, computational biologist</p>
<p>Once you’ve made the decision to deprecate a project, you need to let users know and, if possible, suggest alternatives. “I spread word through a blog post and a tweet announcing that I wasn’t going to actively fix bugs anymore and pointed people to Seaborn instead,” Botvinnik says.</p>
<h2 class="wp-block-heading" id="h-do-keep-the-code-online">Do: Keep the code online</h2>
<p>Instead of deleting your project, it’s almost always best to <a href="https://docs.github.com/en/repositories/archiving-a-github-repository/archiving-repositories">archive it</a> instead. Archiving a project makes it read-only and communicates to users that it’s no longer maintained. Everything from issues and pull requests to milestones and permissions become read-only. But you can always unarchive a project if you later decide to work on it again.</p>
<p>Deleting your project could have unintended consequences. “Anyone thinking about taking their software offline should consider whether they might be creating reproducibility problems for people in science and academia,” Botvinnik points out.</p>
<p>Keeping it online means that even if you couldn’t find someone to take it over before you deprecated the project, someone else could come along later and fork it—or at least find something useful to reuse.</p>
<p>That said, if you believe your code is actively harmful, it might be best to take it offline. For example, software with dangerous security vulnerabilities that put users at risk.</p>
<h2 class="wp-block-heading" id="h-take-this-with-you">Take this with you</h2>
<p>Ultimately, open source projects are living entities—born from passion and sustained by community. Knowing when and how to let go is not just good stewardship, it’s an essential part of the open source lifecycle.</p>
<div class="wp-block-group post-content-cta has-global-padding is-layout-constrained wp-block-group-is-layout-constrained">
<p><a href="https://docs.github.com/en/get-started/exploring-projects-on-github/finding-ways-to-contribute-to-open-source-on-github">Get started</a> contributing to open source now.</p>
</div>
<p>The post <a href="https://github.blog/open-source/maintainers/dos-and-donts-when-sunsetting-open-source-projects/">Dos and don’ts when sunsetting open source projects</a> appeared first on <a href="https://github.blog">The GitHub Blog</a>.</p>
Welcome to Maintainer Month: Events, exclusive discounts, and a new security challenge - The GitHub Blog
https://github.blog/?p=87491 (2025-05-05T17:30:54.000Z)
<p>Open source software (OSS) is everywhere—it’s the lifeblood of the modern software ecosystem. Ninety percent of companies use open source<sup>1</sup>, 97% of codebases contain open source<sup>2</sup>, 70-90% of the code within commercial tools comes from open source<sup>3</sup>, and the value of OSS globally is estimated to be $8.8 trillion<sup>4</sup>. At GitHub, we <strong>love</strong> open source—and we’re so honored to host so much open source code that we famously <a href="https://archiveprogram.github.com/arctic-vault/">preserved it in the Arctic</a>.</p>
<p>But in the same way that your office microwave doesn’t just magically get clean and your favorite park doesn’t have self-mowing grass, open source software doesn’t just <em>happen</em>. </p>
<p>We’re surrounded by human-maintained infrastructure and resources that, in our busy lives, can be easy to take for granted. This is why we started <a href="http://maintainermonth.github.com"><strong>Maintainer Month</strong></a>—a time to thank the open source software maintainers that keep projects healthy. This May marks the fifth annual Maintainer Month, and there are lots of treats in store: new badges, special discounts, events with experts, and more. And, of course, the device you’re reading this on keeps functioning. Thanks, open source maintainers!</p>
<h2 class="wp-block-heading" id="h-maintainer-month-events-and-livestreams"><strong>Maintainer Month events and livestreams</strong></h2>
<p>There are over 25 events and livestreams scheduled during Maintainer Month, so <a href="https://maintainermonth.github.com/schedule">head on over to the schedule</a> to see them all or add your own!</p>
<p>Everyone is welcome at these events—whether or not you’re ready to call yourself a software maintainer. Here are a couple of our favorites, since they tackle thorny issues: </p>
<ul class="wp-block-list">
<li><a href="https://maintainermonth.github.com/schedule/2025-05-06-Licensing-SBOMs"><strong>What maintainers need to know about open source licensing, SBOMs and security</strong></a><strong>: May 6, 2025</strong><strong><br></strong>Join our colleague Jeff Luszcz from the GitHub Open Source Programs Office as he reviews what every maintainer should know about these topics in the ever-evolving landscape of 2025. We get so many questions about this, and Jeff is the expert!</li>
<li><a href="https://maintainermonth.github.com/schedule/2025-05-27-CRA"><strong>The CRA and Open Source: What Maintainers Really Need to Know</strong></a><strong>: May 27, 2025</strong><strong><br></strong>Feeling stressed about the European Union’s new Cyber Resilience Act (CRA) regulations? We can help! Come to this stream with the Eclipse Foundation’s Cyber Resilience Working Group, where they’ll talk about resources and practical information for maintainers navigating these changes.</li>
</ul>
<h2 class="wp-block-heading" id="h-meet-the-2025-partner-pack"><strong>🎁 Meet the 2025 Partner Pack</strong></h2>
<p>This year, we’re launching the new <strong>Maintainer Month Partner Pack</strong>—a bundle of perks, tools, and resources from organizations that truly believe in open source. Think of it as a care package for the folks behind our digital infrastructure.</p>
<p>Here’s just a taste of what’s inside (and it’s available to all maintainers):</p>
<ul class="wp-block-list">
<li><strong>Arachne Digital: </strong>Free tailored threat report with steps to defend your project</li>
<li><strong>Boot.dev</strong>: One month of free premium access to backend dev courses</li>
<li><strong>CNCF</strong>: Discounts on select cloud native training (Kubernetes included!)</li>
<li><strong>DevCycle</strong>: A full year of the Developer plan, free for maintainers</li>
<li><strong>JSConf North America</strong>: Special discounted tickets for Maintainer Month</li>
<li><strong>Linux Foundation Education</strong>: 25% off the full course catalog</li>
<li><strong>Mockoon</strong>: Free Mockoon Cloud account to build, test, and mock APIs faster</li>
<li><strong>Sentry</strong>: Access to their open source plan for monitoring and performance</li>
<li><strong>TODO Group</strong>: 20% off the CODE certification for enterprise open source</li>
<li><strong>Web Summit:</strong> Discounted tickets to Vancouver & Lisbon for OSS contributors</li>
</ul>
<p>…and we’ll be adding more throughout May. </p>
<p>👉 <a href="http://maintainermonth.github.com/partner-pack">See all current offers and partners here</a>.</p>
<p>Some partners are offering extra perks for members of our private Maintainer Community—a vetted space to connect, share, and support each other. If you maintain an open source project,<a href="https://maintainers.github.com/"> you can request to join our Maintainer Community</a>.</p>
<h2 class="wp-block-heading" id="h-security-a-new-challenge"><strong>Security: a new challenge</strong></h2>
<p>Security is <em>kind of a big deal</em>, which is why you hear about it all the time. This is why we’re excited to launch new security guidance on<a href="https://opensource.guide"> opensource.guide</a> to help maintainers strengthen the trust and resilience of their open source projects. We’ve pulled together practical advice and tools you can start using right away to make your project safer for everyone who relies on it. Because building great open source software isn’t just about what your project does—it’s about how you protect the people who use it.</p>
<p>The new Open Source Guide on <a href="https://opensource.guide/security-best-practices-for-your-project/">Security Best Practices for Your Project</a> will walk you through the basic considerations for software security, including how to:</p>
<ul class="wp-block-list">
<li>Secure your code as part of your development workflow</li>
<li>Avoid unwanted changes with protected branches</li>
<li>Set up an intake mechanism for vulnerability reporting</li>
</ul>
<aside data-color-mode="light" data-dark-theme="dark" data-light-theme="light_dimmed" class="wp-block-group post-aside--large p-4 p-md-6 is-style-light-dimmed has-global-padding is-layout-constrained wp-block-group-is-layout-constrained is-style-light-dimmed--1" style="border-top-width:4px">
<h2 class="wp-block-heading h5-mktg gh-aside-title is-typography-preset-h5" id="h-looking-to-improve-security-on-your-project-with-expertise-and-community-apply-to-the-github-secure-open-source-fund-nbsp" style="margin-top:0">Looking to improve security on your project with expertise and community? Apply to the GitHub Secure Open Source Fund. </h2>
<p>You don’t have to secure your project alone—we’re here to support you. The GitHub Secure Open Source Fund provides a combination of funding, training sessions with experts, and access to researchers and community to help get your project more secure. We’re completely serious: we will <strong>give you money ($10K)</strong> to improve your security practices! <br><br><a href="https://resources.github.com/github-secure-open-source-fund/">Apply to the GitHub Secure Open Source Fund ></a></p>
</aside>
<h3 class="wp-block-heading" id="h-security-challenge-level-up-during-maintainer-month"><strong>🔒 Security Challenge: Level up during Maintainer Month</strong></h3>
<p>Ready to boost your project’s defenses—and your own skills?</p>
<p>This May, take the Maintainer Month Security Challenge, which features three hands-on GitHub security skills and lets you snag a voucher for the GitHub Advanced Security certification (hello, career boost!).</p>
<p>In just a few hours, you’ll pick up real techniques to protect your project—and show the world you’re serious about security. Let’s build a safer open source together.</p>
<p><a href="http://maintainermonth.github.com/security-challenge">Join the Security Challenge ></a></p>
<h2 class="wp-block-heading" id="h-how-to-get-involved-throughout-may-and-beyond"><strong>🔧 How to get involved throughout May and beyond</strong></h2>
<ul class="wp-block-list">
<li><strong>Explore the Partner Pack:</strong> <a href="https://maintainermonth.github.com">maintainermonth.github.com</a></li>
<li><strong>Join the Maintainer Community:</strong> <a href="https://maintainers.github.com">maintainers.github.com</a></li>
<li><strong>Want to contribute an offer?</strong> Email us at <a href="mailto:maintainermonth@github.com">maintainermonth@github.com</a></li>
<li><strong>Share your story:</strong> Tag #MaintainerMonth on social media to help us celebrate the humans behind the code!</li>
</ul>
<div class="wp-block-group post-content-cta has-global-padding is-layout-constrained wp-block-group-is-layout-constrained">
<p><a href="https://github.blog/open-source/">Read more</a> about what’s happening with open source.</p>
</div>
<div class="wp-block-group has-global-padding is-layout-constrained wp-block-group-is-layout-constrained">
<p class="has-small-font-size"><sup>1</sup> GitHub. 2022. “Octoverse 2022: The state of open source software.” <a href="https://octoverse.github.com/2022/">https://octoverse.github.com/2022/</a>. & OpenUK. 2021. “State of Open: The UK in 2021.” <a href="https://openuk.uk/wp-content/uploads/2021/10/openuk-state-of-open_final-version.pdf">https://openuk.uk/wp-content/uploads/2021/10/openuk-state-of-open_final-version.pdf</a>. </p>
<p class="has-small-font-size"><sup>2</sup> Blackduck. 2025. “Six takeaways from the 2025 “Open Source Security and Risk Analysis” report.” <a href="https://www.blackduck.com/blog/open-source-trends-ossra-report.html">https://www.blackduck.com/blog/open-source-trends-ossra-report.html</a>.</p>
<p class="has-small-font-size"><sup>3</sup> The Linux Foundation. 2022. “A Summary of Census II: Open Source Software Application Libraries the World Depends On.” <a href="https://www.linuxfoundation.org/blog/blog/a-summary-of-census-ii-open-source-software-application-libraries-the-world-depends-on">https://www.linuxfoundation.org/blog/blog/a-summary-of-census-ii-open-source-software-application-libraries-the-world-depends-on</a>. & Intel. 2025. “The Careful Consumption of Open Source Software.” <a href="https://www.intel.com/content/www/us/en/developer/articles/guide/the-careful-consumption-of-open-source-software.html#:~:text=Similarly%2C%20a%202022%20Linux%20Foundation,up%20of%20open%20source%20components">https://www.intel.com/content/www/us/en/developer/articles/guide/the-careful-consumption-of-open-source-software.htm</a>. </p>
<p class="has-small-font-size"><sup>4</sup> Harvard Business School. 2024. “The Value of Open Source Software.” <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4693148">https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4693148</a>. </p>
</div>
<p>The post <a href="https://github.blog/open-source/maintainers/welcome-to-maintainer-month-events-exclusive-discounts-and-a-new-security-challenge/">Welcome to Maintainer Month: Events, exclusive discounts, and a new security challenge</a> appeared first on <a href="https://github.blog">The GitHub Blog</a>.</p>
Enhancing the Python ecosystem with type checking and free threading - Engineering at Meta
https://engineering.fb.com/?p=22472 (2025-05-05T16:00:05.000Z)
<p><i>Meta and Quansight have improved key libraries in the Python ecosystem. There is plenty more to do, and we invite the community to help with our efforts.</i></p>
<p>We’ll look at two key efforts in Python’s packaging ecosystem to make packages faster and easier to use:</p>
<ul>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;"><img src="https://s.w.org/images/core/emoji/15.1.0/72x72/1f680.png" alt="🚀" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Unlock performance wins for developers through free-threaded Python – where we leverage Python 3.13’s support for concurrent programming (made possible by removing the Global Interpreter Lock (GIL)). </span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;"><img src="https://s.w.org/images/core/emoji/15.1.0/72x72/2705.png" alt="✅" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Increase developer velocity in the IDE with improved type annotations.</span></li>
</ul>
<h2>Enhancing typed Python in the Python scientific stack</h2>
<p>Type hints, introduced in Python 3.5 with <a href="https://peps.python.org/pep-0484/" target="_blank" rel="noopener">PEP 484</a>, allow developers to specify variable types, enhancing code understanding without affecting runtime behavior. Type checkers validate these annotations, helping prevent bugs and improving IDE features like autocomplete and jump-to-definition. Despite their benefits, adoption is inconsistent across the open source ecosystem, with varied approaches to specifying and maintaining type annotations.</p>
<p>The landscape of open source software is fractured with respect to how type annotations are specified, maintained, and distributed to end users. Some projects have inline annotations (types declared directly in the source code), others keep types in stub files, and many projects have no types at all, relying on third-party repositories such as <a href="https://github.com/python/typeshed" target="_blank" rel="noopener">typeshed</a> to provide community-maintained stubs. Each approach has its own pros and cons, but application and maintenance of them <a href="https://discuss.python.org/t/prevalence-staleness-of-stubs-packages-in-pypi/70457" target="_blank" rel="noopener">has been inconsistent</a>. A short sketch of the two main approaches follows.</p>
<p><span style="font-weight: 400;">Meta and Quansight are addressing this inconsistency through:</span></p>
<ol>
<li><b>Direct contributions:</b> We have improved the type coverage for pandas-stubs and numpy, and are eager to expand the effort to more packages.</li>
<li><b>Community engagement:</b> Promoting type annotation efforts to encourage community involvement, listen to feedback, and create actionable ways to improve the ecosystem.</li>
<li><b>Tooling and automation:</b> Developing tools to address common challenges in adding types and keeping them up to date with the source code.</li>
</ol>
<h2>Improved type annotations in pandas</h2>
<p>TL;DR: <i>Pandas is the second most downloaded package in the Python scientific stack. We improved the <a href="https://github.com/pandas-dev/pandas-stubs/" target="_blank" rel="noopener">pandas-stubs</a> package’s type annotation coverage from 36% to over 50%.</i></p>
<h3>Background</h3>
<p>The pandas community maintains its own stubs in a separate repository, which must be installed to obtain type annotations. Although these stubs are checked separately from the source code, they let the community use types with their own type checkers and IDEs.</p>
<h3>Improving type coverage</h3>
<p>When we began our work on pandas-stubs, coverage was around 36%, as measured by the percentage of parameters, returns, and attributes that had a complete type annotation (the annotation is present and all generics have type arguments). After several weeks of work and about 30 PRs, type completeness is now measured at over 50%. The majority of our contributions involved adding annotations to previously untyped parameters, adding type arguments to raw generic types, and removing deprecated or undocumented interfaces. We also improved several inaccurate annotations and updated others to match the inline annotations in the pandas source code.</p>
<h3>Key introductions</h3>
<p>Two key introductions significantly increased coverage:</p>
<ul>
<li>Replacing raw <code>Series</code> types with <code>UnknownSeries</code>, a new type aliased to <code>Series[Any]</code> (see the sketch after this list). When applied to return type annotations, this reduces the number of type checker false positives when the function is called.</li>
<li>Improving types of core DataFrame operations like insert, combine, replace, transpose, and assign, as well as many timestamp- and time-zone-related APIs.</li>
</ul>
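<p>To make the first point concrete, here is a small, hypothetical sketch of the idea (with a stand-in <code>Series</code> class, not the actual pandas-stubs source): aliasing <code>Series[Any]</code> lets a stub say “this returns a Series whose element type is unknown” without tripping type checkers on follow-up calls.</p>
<pre><code>from typing import Any, Generic, TypeVar

T = TypeVar("T")

class Series(Generic[T]):
    """Stand-in for pandas.Series, only to illustrate the annotation pattern."""

# An alias that makes "element type unknown" explicit.
UnknownSeries = Series[Any]

def combine_values(left: Series[Any], right: Series[Any]) -> UnknownSeries:
    # Returning Series[Any] rather than a bare, unparameterized Series keeps
    # type checkers from emitting false positives on calls made with the result.
    ...
</code></pre>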
<h3>Tooling development</h3>
<p>In addition to improving coverage directly, we developed tooling to catalog public interfaces missing annotations. We also augmented our tools for measuring type coverage to handle the situation where stubs are distributed independently, rather than being packaged into the core library wheel.</p>
<h2>What is free-threaded Python?</h2>
<p>Free-threaded Python (FTP) is an experimental build of CPython that allows multiple threads to interact with the VM in parallel. Previously, access to the VM required holding the global interpreter lock (GIL), thereby serializing the execution of concurrently running threads. With the GIL becoming optional, developers will be able to take full advantage of multi-core processors and write truly parallel code.</p>
<h3>Benefits of free-threaded Python</h3>
<p>The benefits of free-threaded Python are numerous:</p>
<ul>
<li><b>True parallelism in a single process</b>: With the GIL removed, developers can write Python code that takes full advantage of multi-core processors without needing to use multiple processes. CPU-bound code can execute in parallel across multiple cores (see the sketch after this list).</li>
<li><b>Improved performance:</b> By allowing multiple threads to execute Python code simultaneously, work can be effectively distributed across multiple threads inside a single process.</li>
<li><b>Simplified concurrency:</b> Free-threading provides developers with a more ergonomic way to write parallel programs in Python. Gone are the days of needing to use <code>multiprocessing.Pool</code> and/or resorting to custom shared memory data structures to efficiently share data between worker processes.</li>
</ul>
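<p>To make the difference concrete, here is a small, self-contained sketch (not from the post): pure-Python, CPU-bound work split across threads. On a standard CPython build the GIL serializes the four workers; on a free-threaded 3.13 build they can run on separate cores.</p>
<pre><code>import threading
import time

def count_primes(limit: int) -> int:
    """CPU-bound work: naive prime counting up to `limit`."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

results = [0] * 4

def worker(i: int) -> None:
    results[i] = count_primes(200_000)

start = time.perf_counter()
threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"{sum(results)} primes counted in {time.perf_counter() - start:.1f}s")
</code></pre>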
<h3>Getting Python’s ecosystem ready for FTP</h3>
<p>The ecosystem of Python packages must work well with free-threaded Python in order for it to be practically useful; application owners can’t use free-threading unless their dependencies work well with it. To that end, we have been taking a bottom-up approach, tackling the most difficult and most popular packages in the ecosystem first. <a href="https://py-free-threading.github.io/tracking/" target="_blank" rel="noopener">We’ve added free-threading support</a> to many of the most popular packages used for scientific computing (e.g., numpy, scipy, scikit-learn) and language bindings (e.g., Cython, nanobind, pybind, PyO3).</p>
<h2>Just getting started</h2>
<p>Together, we made substantial progress in improving type annotations and free-threading compatibility in Python libraries. We couldn’t have done it without the Python community and are asking others to join our efforts. Whether it’s <a href="https://discuss.python.org/t/call-for-suggestions-nominate-python-packages-for-typing-improvements/80186" target="_blank" rel="noopener">further updates to the type annotations</a> or <a href="https://py-free-threading.github.io/porting/" target="_blank" rel="noopener">preparing your code for FTP</a>, we value your help moving the Python ecosystem forward!</p>
<p>To learn more about Meta Open Source, visit our <a href="https://opensource.fb.com/" target="_blank" rel="noopener">open source site</a>, subscribe to our <a href="https://www.youtube.com/channel/UCCQY962PmHabTjaHv2wJzfQ" target="_blank" rel="noopener">YouTube channel</a>, or follow us on <a href="https://www.facebook.com/MetaOpenSource" target="_blank" rel="noopener">Facebook</a>, <a href="https://www.threads.net/@metaopensource" target="_blank" rel="noopener">Threads</a>, <a href="https://x.com/MetaOpenSource" target="_blank" rel="noopener">X</a> and <a href="https://www.linkedin.com/showcase/meta-open-source?fbclid=IwZXh0bgNhZW0CMTEAAR2fEOJNb7zOi8rJeRvQry5sRxARpdL3OpS4sYLdC1_npkEy60gBS1ynXwQ_aem_mJUK6jEUApFTW75Emhtpqw" target="_blank" rel="noopener">LinkedIn</a>.</p>
<p>The post <a rel="nofollow" href="https://engineering.fb.com/2025/05/05/developer-tools/enhancing-the-python-ecosystem-with-type-checking-and-free-threading/">Enhancing the Python ecosystem with type checking and free threading</a> appeared first on <a rel="nofollow" href="https://engineering.fb.com">Engineering at Meta</a>.</p>
Copilot ask, edit, and agent modes: What they do and when to use them - The GitHub Blog
https://github.blog/?p=87437 (2025-05-02T16:00:00.000Z)
<p>If you have opened Copilot Chat in VS Code lately and didn’t notice the tiny dropdown hiding at the bottom, you’re not alone. It’s easy to miss, especially when you’re heads down trying to get something shipped. That little menu is where you can switch between ask, edit, and agent modes—three ways you can integrate Copilot into your workflow.</p>
<figure class="wp-block-video"><video controls src="https://github.blog/wp-content/uploads/2025/05/7628641195839613148mode-selector.mp4"></video></figure>
<p>But here’s the thing: Each of these modes does something pretty different. And depending on what kind of developer you are, how much context you have, or how much control you’re comfortable giving <a href="https://github.com/features/copilot">GitHub Copilot</a>, you might have a very different experience with each one. </p>
<p>I’ve been using all three in different ways, from building apps and testing ideas to playing with frameworks I haven’t touched in a while. I’ve also spent time talking to other developers about what’s working for them. </p>
<p>Below is what I’ve learned so far. This isn’t <em>the</em> manual (for that, look at the <a href="https://code.visualstudio.com/docs/copilot/chat/copilot-chat#_chat-mode">VS Code documentation</a>). It’s more of a guide for how to think about these tools, as you navigate where they fit into your own workflow.</p>
<h2 class="wp-block-heading" id="h-ask-mode-the-quick-gut-check">Ask mode: The quick gut check</h2>
<p>Ask mode is the simplest of the three—and if you are a long-time user of GitHub Copilot, it might be the only time you’ve thought about bringing up the Chat window. </p>
<p>Here’s how it works: You highlight some code, type a question into Copilot Chat, and it generates an answer. It might explain what the code does, suggest how to test it, give you a code snippet that implements what you are asking about, or remind you how to handle a particular edge case. </p>
<p>Ask mode is fast, helpful, and focused entirely on answering your programming question without touching your code. You can stay in your editor and ask questions that Copilot can answer using all the context of your current editor environment. Think of it like a quiet little whisper in your editor saying, “Hey, here’s what I think this means.”</p>
<p>And it’s not just about your code. You can ask it anything related to programming—like how to use a certain library, how to structure a SQL query, and even which search algorithm is more efficient for a given dataset. </p>
<p>Need help styling something with Tailwind? Want a refresher on closures in JavaScript? Curious how to debounce an input in React? Copilot can help with that, too.</p>
<p>There’s no project commitment, no architectural decisions, and no code changes. Just answers, right when you need them. It’s the lowest-friction way to get unstuck when a question is standing in the way of you and building what’s next.</p>
<h2 class="wp-block-heading" id="h-edit-mode-you-re-still-in-charge-just-moving-faster">Edit mode: You’re still in charge, just moving faster</h2>
<p>Edit mode in VS Code is where things start to get more interesting. It lets you pick any number of files in your project you want to change and describe the update in natural language. Then, Copilot will immediately apply inline, review-ready code edits across those files. </p>
<p>Edit mode is perfect when you know what you want to do but don’t necessarily want to write it all out yourself. You highlight a block of code, type in an instruction—perhaps something like “add error handling” or “refactor this using async/await”—and Copilot rewrites the code for you. But (and this is important) it doesn’t save anything without showing you the diff first.</p>
<p>That’s what makes edit mode so reliable. Copilot does the work, but you get the final say. You’re not handing over the reins. You’re speeding things up while staying fully in the loop.</p>
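<p>To make that concrete, here’s the kind of change you might review after asking edit mode to “refactor this using async/await” and “add error handling.” (This before-and-after is a hypothetical sketch; the function and endpoint are made up for illustration.)</p>
<pre class="wp-block-code"><code>// Before (promise chains):
//   function loadUserName(id: string) {
//     return fetch(`/api/users/${id}`)
//       .then((res) => res.json())
//       .then((user) => user.name);
//   }

// After edit mode applies "refactor this using async/await"
// and "add error handling":
async function loadUserName(id: string) {
  const res = await fetch(`/api/users/${id}`);
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  const user = await res.json();
  return user.name;
}</code></pre>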
<p>You can also <a href="https://docs.github.com/en/copilot/customizing-copilot/adding-repository-custom-instructions-for-github-copilot">bring custom instructions</a> into edit mode if you want to level it up a bit. It’s a way to teach Copilot how you and your team like to write code, including your style preferences, how verbose or concise you want it to be, what your team’s standards are, how formal or casual you want it when it explains things, and even what language you want your comments written in. Setting up those preferences is like giving Copilot a playbook ahead of time—so when you say “clean this up,” it already knows what “clean” actually means to you.</p>
<p>I’ve found myself reaching for edit mode when I’m deep in a brownfield app and don’t want to touch the rest of the system, or when I’m just trying to get through a handful of small but annoying improvements with surgical precision. It’s not trying to redesign your architecture. It’s not going to rename things you didn’t ask it to. It’s just here to make the thing you’re working on better and faster, with less mental overhead.</p>
<p>Honestly, once you get used to it, it’s hard to go back.</p>
<h2 class="wp-block-heading" id="h-agent-mode-a-lot-of-power-when-you-re-ready-for-it">Agent mode: A lot of power, when you’re ready for it</h2>
<p>Now let’s talk about agent mode. Agent mode lets you hand Copilot a high-level prompt and then watch as it autonomously plans the steps, selects the right files, runs tools or terminal commands, and iterates on code edits until the task is complete.</p>
<p>It is by far the most powerful mode in Copilot Chat—and also the newest, which means for a lot of people, it’s still the least familiar. Agent mode can reason across your entire project, take multi-step actions, and hold onto a significant amount of context across a session. You can ask it to build features, fix bugs, create files, clean up routing logic, or even scaffold an entire section of an app based on a single prompt.</p>
<figure class="wp-block-video"><video controls src="https://github.blog/wp-content/uploads/2025/05/sn0404-agent-dark.mp4"></video></figure>
<p>At first glance, agent mode can look like an expanded version of edit mode—and in some respects, it is. But there’s a crucial distinction: Instead of only rewriting the lines you specify, agent mode analyzes related code, identifies additional changes that may be required, and applies them across the project to keep everything consistent.</p>
<p>Another key distinction? Agent mode applies edits automatically rather than waiting for explicit approval, while still surfacing any potentially risky commands for review before they run.</p>
<p>The workflow is closer to a continuous-edit “driver” model: The developer defines the goal, and Copilot executes updates without stopping for permission at every step. </p>
<p>For some developers, that feels natural and empowering. For others, it can feel like giving up a little more control than you are used to.</p>
<p>One thing that makes agent mode even better in real-world projects is <a href="https://code.visualstudio.com/docs/copilot/copilot-customization#_custom-instructions">custom instructions</a>, which I mentioned above. This is where you can really start shaping how Copilot behaves across a session. </p>
<p>For example, here are the custom instructions we used with agent mode in one of our demo projects: </p>
<pre class="wp-block-code"><code>This is a Next.js-based travel application with TypeScript that helps users search for trips, manage bookings, view travel guides, and track points. The application uses React components, server components, and client components as part of the Next.js App Router architecture. Please follow these guidelines when contributing:
## Code Standards
### Required Before Each Commit
- Run `npm run lint` to ensure code follows project standards
- Make sure all components follow Next.js App Router patterns
- Client components should be marked with 'use client' when they use browser APIs or React hooks
- When adding new functionality, make sure you update the README
- Make sure that the repository structure documentation is correct and accurate in the Copilot Instructions file
- Ensure all tests pass by running `npm run test` in the terminal
### TypeScript and React Patterns
- Use TypeScript interfaces/types for all props and data structures
- Follow React best practices (hooks, functional components)
- Use proper state management techniques
- Components should be modular and follow single-responsibility principle
### Styling
- You must prioritize using Tailwind CSS classes as much as possible. If needed, you may define custom Tailwind Classes / Styles. Creating custom CSS should be the last approach.
## Development Flow
- Install dependencies: `npm install`
- Development server: `npm run dev`
- Build: `npm run build`
- Test: `npm run test`
- Lint: `npm run lint`
## Repository Structure
- `app/`: Next.js App Router pages and layouts organized by route
- `components/`: Reusable React components
- `components/ui/`: UI components (buttons, inputs, etc.)
- `components/__tests__/`: Component tests
- `lib/`: Core logic and services
- `lib/data/`: Data models and mock data
- `lib/types/`: TypeScript type definitions
- `public/`: Static assets
- `tests/`: Test files and test utilities
- `README.md`: Project documentation
## Key Guidelines
1. Make sure to evaluate the components you're creating, and whether they need 'use client'
2. Images should contain meaningful alt text unless they are purely for decoration. If they are for decoration only, a null (empty) alt text should be provided (alt="") so that the images are ignored by the screen reader.
3. Follow Next.js best practices for data fetching, routing, and rendering
4. Use proper error handling and loading states
5. Optimize components and pages for performance</code></pre>
<p>If you’ve ever noticed that agent mode sometimes forgets how it structured your backend when it later builds the frontend, you’re not imagining things. The context window is big (and getting bigger all the time), but it’s still finite. Instead of stuffing every little reminder into your prompts, you can use custom instructions to set some ground rules ahead of time: things like how you want APIs called, naming patterns you want followed, or even stylistic preferences across your codebase. It gives agent mode a stronger foundation to work from, which means you get more consistency without needing to micromanage every step.</p>
<p>I won’t go too deep into custom instructions here (the <a href="https://code.visualstudio.com/docs/copilot/copilot-customization#_custom-instructions">VS Code documentation does a great job</a>), but if you are working across multiple sessions or building something bigger than a one-off script, they are absolutely worth looking into. It has made a noticeable difference for me, especially on side projects where I want to move fast but keep some structure in place.</p>
<p>I have had some really strong results using agent mode this way. I’ve started projects by dropping a README into a new repo, setting a clear vision for what I want, and letting Copilot take the first pass. It builds out components, layouts, routes, and even seeds the content. It isn’t perfect straight out of the gate (and it shouldn’t be), but it gets me so much closer to a usable starting point than building from scratch. The more detailed and thoughtful my initial prompts and setup, the better agent mode performs.</p>
<p>Of course, it’s still important to stay engaged. Occasionally, agent mode might suggest running a command you don’t expect. Or it might touch a file you thought you had agreed to leave alone. And sometimes, it just takes a little longer to reason through things, especially on bigger projects. </p>
<p>In live demo situations where time is tight, that unpredictability can definitely make things more… exciting. But when I’m building day-to-day, especially when experimenting or kicking off a new project, agent mode fits naturally into my creative process.</p>
<p>Working with agent mode is a lot like pairing with that brilliant friend who moves fast and sometimes thinks three steps ahead. If you’re aligned, amazing things can happen. If you’re not, you might have to nudge them back onto the path you want. </p>
<p>The key is good communication: prompt files, thoughtful initial instructions, and clear nudges along the way. This keeps the collaboration productive and fun.</p>
<h2 class="wp-block-heading" id="so-what-does-this-mean-for-senior-developers">So what does this mean for senior developers?</h2>
<p>This question has come up a lot in conversations with my team. Agent mode definitely feels like the future when you use it, but just because it is powerful does not mean it is always the right tool for the job. Sometimes you’ll be working on part of a codebase that needs to be handled with a little more precision. If you are tweaking a couple of files or making a targeted change in a sensitive system, ask or edit mode might be the better fit.</p>
<p>A lot of the time, the most experienced developers are the ones who know exactly which parts of a system should be touched carefully, or maybe not touched at all, to avoid bigger problems later. That isn’t about being cautious for the sake of it. It’s about understanding the complexity that lives under the surface.</p>
<p>And here’s something important: Agent mode isn’t just for people who are new to building software. In a lot of ways, it actually works best when the person using it knows how to give clear, strong instructions—and understands the nuances of code and algorithmic structures. If you know how your system is structured, where the fragile edges are, and when to review changes vs. when to trust the flow, you’re in a great spot to get real value out of agent mode. That is senior dev territory.</p>
<p>Of course, strong opinions come with the territory too. Agent mode is doing its best, but unless you are explicit about the rules you want it to follow, it isn’t going to automatically pick up your conventions. That is where custom instructions come into the picture. When senior engineers write down the things they normally just know (the naming patterns, the design principles, the places to be careful) and commit those instructions to version control alongside the project, it gives the entire team a better experience. It doesn’t just make agent mode faster—it helps it give you better suggestions, too.</p>
<p>There will still be plenty of moments where edit mode feels like the better choice. When you just need a second set of eyes without handing over the whole keyboard, edit mode keeps you moving without ever getting too far ahead. And that’s fine. </p>
<p>Honestly, I think that as a community we could do a better job showing how agent mode complements expertise instead of only showing it in greenfield examples. Demos that start from zero are easy to understand. But in the real world, most developers are iterating on existing systems—not spinning up a new app every week. Showing how Copilot fits into that kind of work—the messy, important middle—is where things really get exciting.</p>
<h2 class="wp-block-heading" id="take-this-with-you">Take this with you</h2>
<p>Ask, edit, and agent modes aren’t three versions of the same tool. They’re three completely different and unique experiences within GitHub Copilot. Ask mode is the quick way to get an answer to your question. Edit mode is the assistant telling you what it recommends across your files. And agent mode is the assistant that just goes ahead and does what it thinks you are asking for—which is great as long as it’s following your spoken (and unspoken) instructions. </p>
<p>If you’ve only tried one mode so far, now’s a good time to play around. Try using agent mode with a fresh repository. Or try using edit mode on a complex refactoring job you’ve been putting off. And maybe try using ask mode when you’re trying to remember what a slice reducer does because it’s Monday and your brain’s still booting up.</p>
<p>All of these tools are designed to complement your judgment. They’re here to help you do your job better, whatever that looks like for you.</p>
<p>And no matter what, always read the diff.</p>
<div class="wp-block-group post-content-cta has-global-padding is-layout-constrained wp-block-group-is-layout-constrained">
<p><a href="https://docs.github.com/en/copilot/using-github-copilot/ai-models/choosing-the-right-ai-model-for-your-task">Learn more about AI models.</a></p>
</div>
</body></html>
<p>The post <a href="https://github.blog/ai-and-ml/copilot-ask-edit-and-agent-modes-what-they-do-and-when-to-use-them/">Copilot ask, edit, and agent modes: What they do and when to use them</a> appeared first on <a href="https://github.blog">The GitHub Blog</a>.</p>
Taking the plunge: Why Meta is laying the world’s longest subsea cable - Engineering at Metahttps://engineering.fb.com/?p=224812025-05-01T18:48:02.000Z<p>Meta develops infrastructure all across the globe to transport information and content for the billions of people using our services around the world. At the core of this infrastructure are aggregation points – like data centers – and the digital cables that connect them. Subsea cables – the unseen digital highways of the internet – are critical for Meta to serve people wherever they are in the world. In fact, more than 95% of the world’s intercontinental traffic goes through subsea cables.</p>
<p><span style="font-weight: 400;">Meta’s engineering team prioritizes both innovation and quality when designing and deploying these cables. In the latest Meta Tech Podcast, </span><span style="font-weight: 400;">Andy Palmer-Felgate and Pascal Pecci, both subsea cable systems engineers, join</span> <a href="https://www.threads.net/@passy_"><span style="font-weight: 400;">Pascal Hartig</span></a> <span style="font-weight: 400;">on the Meta Tech podcast to discuss the latest in subsea engineering technology. This episode dives deeper into the engineering nuances of large-scale subsea cable projects like the recently announced</span> <a href="https://engineering.fb.com/2025/02/14/connectivity/project-waterworth-ai-subsea-infrastructure/"><span style="font-weight: 400;">Project Waterworth</span></a><span style="font-weight: 400;">. </span></p>
<p><span style="font-weight: 400;">Learn more about Meta’s work on these engineering feats. Download or listen to the episode below:</span></p>
<p><iframe style="border: none;" title="Libsyn Player" src="//html5-player.libsyn.com/embed/episode/id/36358920/height/90/theme/custom/thumbnail/yes/direction/forward/render-playlist/no/custom-color/000000/" width="100%" height="90" scrolling="no" allowfullscreen="allowfullscreen"></iframe></p>
<p><span style="font-weight: 400;">The</span> <a href="https://insidefacebookmobile.libsyn.com/"><span style="font-weight: 400;">Meta Tech Podcast</span></a><span style="font-weight: 400;"> is a podcast, brought to you by Meta, where we highlight the work Meta’s engineers are doing at every level – from low-level frameworks to end-user features.</span></p>
<p><span style="font-weight: 400;">Send us feedback on </span><a href="https://instagram.com/metatechpod"><span style="font-weight: 400;">Instagram</span></a><span style="font-weight: 400;">, </span><a href="https://threads.net/@metatechpod"><span style="font-weight: 400;">Threads</span></a><span style="font-weight: 400;">, or </span><a href="https://twitter.com/metatechpod"><span style="font-weight: 400;">X</span></a><span style="font-weight: 400;">.</span><span style="font-weight: 400;"><br />
</span><span style="font-weight: 400;"><br />
</span><span style="font-weight: 400;">And if you’re interested in learning more about career opportunities at Meta, visit the</span> <a href="https://www.metacareers.com/?ref=engineering.fb.com"><span style="font-weight: 400;">Meta Careers</span></a> <span style="font-weight: 400;">page.</span></p>
<p>The post <a rel="nofollow" href="https://engineering.fb.com/2025/05/01/connectivity/taking-the-plunge-why-meta-is-laying-the-worlds-longest-subsea-cable/">Taking the plunge: Why Meta is laying the world’s longest subsea cable</a> appeared first on <a rel="nofollow" href="https://engineering.fb.com">Engineering at Meta</a>.</p>
The AI-Powered DevOps revolution: Redefining developer collaboration - The GitHub Bloghttps://github.blog/?p=873772025-05-01T17:12:05.000Z
<p>When it comes to mastering DevOps, it’s often not the technical skills that trip us up, but rather the more critical aspects of collaboration and communication. Communication challenges, vague requirements, and missing documentation can leave us guessing on stakeholder intent. Plus, siloed workflows can cause teams to face inconsistencies in their processes, tooling, and time to delivery, as there are so many moving parts. This all works against DevOps best practices.</p>
<p>In this blog, we will look at ways that we can enhance team productivity, reduce cognitive load, and implement a more seamless collaboration across teams and tools to improve code quality and create faster delivery cycles. Adding a bit of AI into our workflows will get us on the right path.</p>
<figure class="wp-block-table"><table class="has-fixed-layout"><tbody><tr><td>💡 <strong>Tip:</strong> When using Copilot (or any generative AI), it’s always a good idea to review any suggestions before accepting them. There’s a reason we refer to GitHub Copilot as an assistant. It is meant to augment your abilities, not serve as a replacement for your skills.</td></tr></tbody></table></figure>
<h2 class="wp-block-heading" id="h-filling-in-communication-gaps-and-documentation">Filling in communication gaps and documentation</h2>
<p>When I get started on a new project, the first thing I do is seek out the README or documentation, if there is any. While we should be writing documentation for our code as we go, the reality is that we often do not. This gap, between intention and practice, creates consistent challenges for our teams as we try to onboard new members or revisit older projects.</p>
<p>Nowadays, to get the documentation I need, whether I’m working on a legacy code base or joining a new community project, I start by opening either <a href="https://marketplace.visualstudio.com/items?itemName=GitHub.copilot-chat">GitHub Copilot Chat in VSCode</a> or the immersive chat experience on github.com and asking Copilot to explain the codebase to me. And when the existing README is lacking, I can then ask Copilot to help me write a better one.</p>
<p>For example, here’s a typical placeholder README:</p>
<figure class="wp-block-image"><a href="https://github.blog/wp-content/uploads/2025/04/devops1.png"><img data-recalc-dims="1" fetchpriority="high" decoding="async" width="1600" height="613" src="https://github.blog/wp-content/uploads/2025/04/devops1.png?resize=1600%2C613" alt='Screenshot of a placeholder README with the header: "Welcome to Tailwind Traders Mail Service." It explains that the service will send transactional emails and also that it is a work in progress.' class="wp-image-87380" srcset="https://github.blog/wp-content/uploads/2025/04/devops1.png?w=1600 1600w, https://github.blog/wp-content/uploads/2025/04/devops1.png?w=300 300w, https://github.blog/wp-content/uploads/2025/04/devops1.png?w=768 768w, https://github.blog/wp-content/uploads/2025/04/devops1.png?w=1024 1024w, https://github.blog/wp-content/uploads/2025/04/devops1.png?w=1536 1536w" sizes="(max-width: 1000px) 100vw, 1000px" /></a></figure>
<p>Beyond a brief description, there’s not much information. I opened up VSCode and GitHub Copilot Chat, attaching the existing README file to give Copilot context, and asked it to simply make my README more detailed.</p>
<figure class="wp-block-image"><a href="https://github.blog/wp-content/uploads/2025/04/devops2.png"><img data-recalc-dims="1" decoding="async" width="1440" height="540" src="https://github.blog/wp-content/uploads/2025/04/devops2.png?resize=1440%2C540" alt='A screenshot showing a GitHub user asking GitHub Copilot, "Could you write a better README for this project with more detail please?"' class="wp-image-87381" srcset="https://github.blog/wp-content/uploads/2025/04/devops2.png?w=1440 1440w, https://github.blog/wp-content/uploads/2025/04/devops2.png?w=300 300w, https://github.blog/wp-content/uploads/2025/04/devops2.png?w=768 768w, https://github.blog/wp-content/uploads/2025/04/devops2.png?w=1024 1024w" sizes="(max-width: 1000px) 100vw, 1000px" /></a></figure>
<p>After asking Copilot to help create a better README, here’s how it looked:</p>
<figure class="wp-block-image"><a href="https://github.blog/wp-content/uploads/2025/04/devops3.png"><img data-recalc-dims="1" decoding="async" width="896" height="1600" src="https://github.blog/wp-content/uploads/2025/04/devops3.png?resize=896%2C1600" alt="A screenshot of a README for Tailwind Traders Mail Service with sections including Table of Contents, Overview, Work In Progress, Getting Started, and CLI Application." class="wp-image-87382" srcset="https://github.blog/wp-content/uploads/2025/04/devops3.png?w=896 896w, https://github.blog/wp-content/uploads/2025/04/devops3.png?w=168 168w, https://github.blog/wp-content/uploads/2025/04/devops3.png?w=768 768w, https://github.blog/wp-content/uploads/2025/04/devops3.png?w=573 573w, https://github.blog/wp-content/uploads/2025/04/devops3.png?w=860 860w" sizes="(max-width: 896px) 100vw, 896px" /></a></figure>
<p>GitHub Copilot provided a project overview, installation and configuration steps, and some usage instructions. This gives me, and anyone else who visits this project, a better understanding of what the project is doing and how to get started.</p>
<p>You can use GitHub Copilot not only to generate a README once you’ve written your code, but also to generate function and class descriptions. Copilot can improve the readability of code by generating descriptions, adding context around what your code is doing, and writing documentation that can reduce the learning curve and onboarding time.</p>
<p>For example, check out this bit of code where Copilot helped write inline comments:</p>
<figure class="wp-block-image"><a href="https://github.blog/wp-content/uploads/2025/04/devops4.png"><img data-recalc-dims="1" loading="lazy" decoding="async" width="682" height="438" src="https://github.blog/wp-content/uploads/2025/04/devops4.png?resize=682%2C438" alt="A screen shot of documentation in code." class="wp-image-87383" srcset="https://github.blog/wp-content/uploads/2025/04/devops4.png?w=682 682w, https://github.blog/wp-content/uploads/2025/04/devops4.png?w=300 300w" sizes="auto, (max-width: 682px) 100vw, 682px" /></a></figure>
<p>Oftentimes, the various technical teams (operations, security, and development) operate independently, not fully aware of other teams’ standards, dependencies, or changes. Using Copilot to create better documentation and code comments can help us to break down these silos.</p>
<h2 class="wp-block-heading" id="h-ai-powered-code-reviews-and-pull-requests">AI-powered code reviews and pull requests</h2>
<p>So, when we do make a code change that impacts other teams, how can we better review those changes to ensure a more seamless and secure delivery? Let’s take a look.</p>
<p>You know when you submit your pull request and you’re in a rush because of task overload? In the example above, we used docstrings to explain our functions and that is extremely helpful when you’re working with complex functions. Copilot can help write your docstrings to provide clear parameter explanations, detailed method descriptions, and context about usage. This helps to reduce the ambiguity for your code reviewers and provides context to the changes that were made.</p>
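<p>To give a rough idea, here is the kind of documentation comment Copilot can generate. (This is a hypothetical example written as a TypeScript JSDoc block; the function and its parameters are made up, not taken from the screenshots in this post.)</p>
<pre class="wp-block-code"><code>/**
 * Calculates the total price of an order, including tax.
 *
 * @param items - Line items, each with a unit price and a quantity.
 * @param taxRate - Tax rate expressed as a decimal, e.g. 0.07 for 7%.
 * @returns The order total rounded to two decimal places.
 */
function calculateOrderTotal(
  items: { price: number; quantity: number }[],
  taxRate: number
): number {
  const subtotal = items.reduce((sum, item) => sum + item.price * item.quantity, 0);
  return Math.round(subtotal * (1 + taxRate) * 100) / 100;
}</code></pre>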
<p>When you get ready to submit your commit, instead of having to think of a funny and witty commit message yourself, select the AI-enhanced commit option and let Copilot generate a clear, concise summary of your changes for you:</p>
<figure class="wp-block-image"><a href="https://github.blog/wp-content/uploads/2025/04/devops5.png"><img data-recalc-dims="1" loading="lazy" decoding="async" width="722" height="281" src="https://github.blog/wp-content/uploads/2025/04/devops5.png?resize=722%2C281" alt="A screenshot showing a code comment." class="wp-image-87384" srcset="https://github.blog/wp-content/uploads/2025/04/devops5.png?w=722 722w, https://github.blog/wp-content/uploads/2025/04/devops5.png?w=300 300w" sizes="auto, (max-width: 722px) 100vw, 722px" /></a></figure>
<p>How often, when we’re reviewing a pull request, is there little to no description? We can use Copilot to help with that, too! After I have committed my code, I can use the “summary” option to generate a summary of my pull request:</p>
<figure class="wp-block-image"><a href="https://github.blog/wp-content/uploads/2025/04/devops6.png"><img data-recalc-dims="1" loading="lazy" decoding="async" width="903" height="353" src="https://github.blog/wp-content/uploads/2025/04/devops6.png?resize=903%2C353" alt="A screenshot showing how to use the GitHub pull request summary feature." class="wp-image-87385" srcset="https://github.blog/wp-content/uploads/2025/04/devops6.png?w=903 903w, https://github.blog/wp-content/uploads/2025/04/devops6.png?w=300 300w, https://github.blog/wp-content/uploads/2025/04/devops6.png?w=768 768w" sizes="auto, (max-width: 903px) 100vw, 903px" /></a></figure>
<p>Copilot reviews the commit and the files changed, and then outputs a thorough summary of the changes with links to the changed files.</p>
<figure class="wp-block-image"><a href="https://github.blog/wp-content/uploads/2025/04/devops-7.png"><img data-recalc-dims="1" loading="lazy" decoding="async" width="903" height="474" src="https://github.blog/wp-content/uploads/2025/04/devops-7.png?resize=903%2C474" alt="A screenshot showing a pull request summary generated by Copilot." class="wp-image-87386" srcset="https://github.blog/wp-content/uploads/2025/04/devops-7.png?w=903 903w, https://github.blog/wp-content/uploads/2025/04/devops-7.png?w=300 300w, https://github.blog/wp-content/uploads/2025/04/devops-7.png?w=768 768w" sizes="auto, (max-width: 903px) 100vw, 903px" /></a></figure>
<p>Oftentimes, we submit a pull request and expect the reviewer to figure out what changes were made, as if they had a crystal ball. In this example, I’ve used Copilot to generate a short summary of the changes in my commit, as well as a longer, more detailed summary of the pull request. This saved me time, but more importantly, it has provided a much higher quality output than I would have been able to do on my own.</p>
<p>Copilot’s ability to summarize and document can be hugely beneficial in creating consistent terminology and communication across teams, eliminating collaboration breakdowns and enhancing code reviews.</p>
<p>Along with using Copilot to summarize my pull request before I push my commit, I can also set Copilot as one of my reviewers by clicking “Reviewers” in the top right-hand corner of my screen.</p>
<figure class="wp-block-image"><a href="https://github.blog/wp-content/uploads/2025/04/devlops-8.png"><img data-recalc-dims="1" loading="lazy" decoding="async" width="1582" height="1600" src="https://github.blog/wp-content/uploads/2025/04/devlops-8.png?resize=1582%2C1600" alt="A screenshot showing the GitHub Copilot Pull Request view." class="wp-image-87387" srcset="https://github.blog/wp-content/uploads/2025/04/devlops-8.png?w=1582 1582w, https://github.blog/wp-content/uploads/2025/04/devlops-8.png?w=297 297w, https://github.blog/wp-content/uploads/2025/04/devlops-8.png?w=768 768w, https://github.blog/wp-content/uploads/2025/04/devlops-8.png?w=1012 1012w, https://github.blog/wp-content/uploads/2025/04/devlops-8.png?w=1519 1519w, https://github.blog/wp-content/uploads/2025/04/devlops-8.png?w=90 90w, https://github.blog/wp-content/uploads/2025/04/devlops-8.png?w=116 116w" sizes="auto, (max-width: 1000px) 100vw, 1000px" /></a></figure>
<p>Once I do that, Copilot reviews my pull request before my other team members, allowing me to find typos and other errors, and iterate my code before asking others to review it. This enables me to iterate faster, getting direct feedback more quickly on my changes and lessening the time my teammates need to take to review my code. It also allows me to learn how to improve my code quality, too.</p>
<h2 class="wp-block-heading" id="h-resolving-merge-conflicts-with-copilot">Resolving merge conflicts with Copilot</h2>
<p>So far, I’ve used Copilot to better define the project README, add inline comments and documentation, and generate more relevant, thorough, and precise commit messages and precise pull request summaries. But most of these changes have been me acting alone—what about when I’m collaborating with my team and I run into a merge conflict? GitHub Copilot can help there, too!</p>
<p>Sometimes, collaboration leads to messy changes and you arrive at a difficult point, trying to decide which version to ultimately commit based on code patterns, environment variables, and comments. There are a couple different ways that Copilot can help you remediate a merge conflict.</p>
<p>When you’re working in the editor, specifically VSCode, it will display the conflicting section of code. You can open a Copilot Chat window or ask inline, “How should I resolve this merge conflict?” Copilot will analyze both versions of code, suggest a resolution, and explain its reasoning. If the suggestion fits, you can accept the solution or ask for alternatives.</p>
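<p>For context, the conflicting section Copilot is reasoning about is the standard Git conflict-marker block, which looks something like this (a made-up example in which two branches edited the same config object):</p>
<pre class="wp-block-code"><code>export const config = {
<<<<<<< HEAD
  apiBaseUrl: "https://api.example.com/v2",
  retryCount: 3,
=======
  apiBaseUrl: "https://api.example.com/v1",
  timeoutMs: 5000,
>>>>>>> feature/update-api-client
};</code></pre>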
<p>If you’re using Copilot on github.com (and licensed for GitHub Enterprise) you can <a href="https://docs.github.com/enterprise-cloud@latest/copilot/using-github-copilot/copilot-chat/asking-github-copilot-questions-in-github#ask-why-a-workflow-has-failed">also ask Copilot why a workflow has failed</a>. Just navigate to the pull request, select the details of the failing checks, and click on the GitHub Copilot icon next to the search bar. Then, you can ask Copilot directly, “Why has this pull request failed?”</p>
<figure class="wp-block-image"><a href="https://github.blog/wp-content/uploads/2025/04/devops-9.png"><img data-recalc-dims="1" loading="lazy" decoding="async" width="1600" height="959" src="https://github.blog/wp-content/uploads/2025/04/devops-9.png?resize=1600%2C959" alt="A screenshot showing the Copilot chat in the GitHub.com UI." class="wp-image-87388" srcset="https://github.blog/wp-content/uploads/2025/04/devops-9.png?w=1600 1600w, https://github.blog/wp-content/uploads/2025/04/devops-9.png?w=300 300w, https://github.blog/wp-content/uploads/2025/04/devops-9.png?w=768 768w, https://github.blog/wp-content/uploads/2025/04/devops-9.png?w=1024 1024w, https://github.blog/wp-content/uploads/2025/04/devops-9.png?w=1536 1536w" sizes="auto, (max-width: 1000px) 100vw, 1000px" /></a></figure>
<p>This is a great way to find a quick resolution to a common problem, especially with long-running feature branches and automation tasks.</p>
<h2 class="wp-block-heading" id="h-transformed-collaboration-and-delivery">Transformed collaboration and delivery</h2>
<p>One of the fundamental challenges in building a high performing DevOps team has always been the burden of repetitive tasks that consume valuable developer time daily. By integrating Copilot into my workflow, I’ve watched it suggest entire functions and fill critical logic gaps, dramatically reducing the boilerplate code that once dominated my coding sessions. This has transformed how I work, allowing me to tackle complex architectural challenges and focus on innovation rather than implementation details.</p>
<p>I encourage you to <a href="https://github.com/copilot">explore these GitHub Copilot capabilities</a> in your own environment. The transformation in both individual productivity and team dynamics might surprise you.</p>
<p>Happy coding!</p>
<p>The post <a href="https://github.blog/ai-and-ml/github-copilot/the-ai-powered-devops-revolution-redefining-developer-collaboration/">The AI-Powered DevOps revolution: Redefining developer collaboration</a> appeared first on <a href="https://github.blog">The GitHub Blog</a>.</p>
Gemma explained: What’s new in Gemma 3 - Google Developers Bloghttps://developers.googleblog.com/en/gemma-explained-whats-new-in-gemma-3/2025-04-30T21:19:03.000ZGemma 3's new features include vision-language capabilities and architectural changes for improved memory efficiency and longer context handling compared to previous Gemma models.
From MCP to multi-agents: The top 10 open source AI projects on GitHub right now and why they matter - The GitHub Bloghttps://github.blog/?p=870572025-04-30T16:00:49.000Z
<p>Every day, new public and open source repositories appear on GitHub, and navigating the sheer amount of activity can be a challenge for the best of us. Luckily, we’ve done the heavy lifting for you.</p>
<p>Together with our panel of GitHub experts—who have experience across open source and developer relations—we analyzed every open source project created in the last 99 days (as of March 29, 2025), and ranked them in consideration of a number of factors including stars-per-day, forks, traffic spikes, and contributor velocity.</p>
<p>Our GitHub panel includes:</p>
<ul>
<li><a href="mailto:abbycabs@github.com">Abigail Cabunoc Mayes</a>, aka @abbycabs, who works on open source maintainer programs and serves as a Director on the OpenJS foundation. </li>
<li><a href="mailto:karasowles@github.com">Kara Sowles</a>, aka @karasowles, who works with maintainers and the open source community. </li>
<li><a href="mailto:kevincrosby@github.com">Kevin Crosby</a>, aka @kevincrosby, who runs GitHub’s open source funding program. </li>
<li><a href="mailto:jeffrey-luszcz@github.com">Jeff Luszcz</a>, aka @jeffrey-luszcz, who helps manage GitHub’s open source program office (OSPO). </li>
</ul>
<p>Below, we’ll give you a rundown of these projects—and also discuss why we believe they’re the ones developers keep coming back to. Let’s dive in.</p>
<aside class="p-4 p-md-6 post-aside--large"><p class="h5-mktg gh-aside-title">Top trends, at a glance</p><ul>
<li><strong>Agents are becoming key:</strong> A year ago the question was “What model can I fine tune?” Now it’s “What agent can I put to work?”</li>
<li><strong><a href="https://github.blog/ai-and-ml/llms/what-the-heck-is-mcp-and-why-is-everyone-talking-about-it/">Model Context Protocol (MCP)</a> is helping integrate AI (and becoming the USB‑C of AI tooling):</strong> More projects are exposing their functions via MCP so any LLM can call them.</li>
<li><strong>Multi‑agent orchestration is no longer research only:</strong> Frameworks like OWL let several specialized agents cooperate on a task.</li>
<li><strong>Speech generation is leveling up:</strong> Projects push TTS/STT beyond “read this text aloud” into precise duration control and natural rhythm and sound.</li>
<li><strong>An increasing number of experiments with digital twins:</strong> There’s interest in personal AI that carries your context and voice across apps.</li>
</ul>
<p><a href="https://github.blog/open-source/maintainers/from-mcp-to-multi-agents-the-top-10-open-source-ai-projects-on-github-right-now-and-why-they-matter/#what-these-projects-tell-us-about-ais-evolution-in-open-source">Get the full analysis ></a></p>
</aside>
</p><h2 id="1-open-webui-mcp-simplifying-ai-tool-integrations-%f0%9f%94%8c" id="1-open-webui-mcp-simplifying-ai-tool-integrations-%f0%9f%94%8c" >1. Open WebUI MCP: simplifying AI tool integrations 🔌<a href="#1-open-webui-mcp-simplifying-ai-tool-integrations-%f0%9f%94%8c" class="heading-link pl-2 text-italic text-bold" aria-label="1. Open WebUI MCP: simplifying AI tool integrations 🔌"></a></h2>
<div class="project-bar mt-5 mt-md-7 mb-5 mb-md-7"> <br />
<div class="d-flex flex-row flex-wrap flex-items-center">
<div class="d-flex flex-row flex-items-center width-full width-md-auto mt-2">
<a class="btn-mktg btn-small-mktg d-block mr-2 width-full width-md-auto" href="https://docs.openwebui.com/" target="_blank">
<svg class="octicon d-inline-block mr-1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" width="24" height="24"><path d="M9.5 15.584V8.416a.5.5 0 01.77-.42l5.576 3.583a.5.5 0 010 .842l-5.576 3.584a.5.5 0 01-.77-.42z"></path><path fill-rule="evenodd" d="M12 2.5a9.5 9.5 0 100 19 9.5 9.5 0 000-19zM1 12C1 5.925 5.925 1 12 1s11 4.925 11 11-4.925 11-11 11S1 18.075 1 12z"></path></svg>
Website </a>
<a class="btn-mktg btn-small-mktg btn-muted-mktg d-block mr-md-2 width-full width-md-auto" href="https://github.com/open-webui/mcpo" target="_blank">
<svg class="octicon d-inline-block mr-1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16"><path fill-rule="evenodd" d="M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0016 8c0-4.42-3.58-8-8-8z"></path></svg>
Source </a>
</div>
<div class="d-flex flex-row flex-items-center mt-2" style="flex-grow: 1;">
<div class="d-flex flex-row flex-wrap text-semibold f5-mktg">
<span class="d-flex flex-row flex-items-center mr-1 my-1">
<svg class="d-block" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16" fill="#3572A5"><path fill-rule="evenodd" d="M8 4a4 4 0 100 8 4 4 0 000-8z"></path></svg>
Python </span>
</div>
<div class="d-flex flex-row flex-wrap ml-auto f5-mktg">
<div class="my-1 ml-2">
<a href="https://github.com/open-webui" target="_blank" arial-label="@open-webui" title="@open-webui">
<img decoding="async" class="d-block circle height-auto" width="24" height="24" src="https://avatars.githubusercontent.com/u/158137808?v=4&s=64" alt="@open-webui">
</a>
</div>
</div>
</div>
</div>
</div>
<p><strong>📜 MIT license</strong></p>
<p>First up is a proxy server that turns MCP tools into OpenAPI-compatible HTTP servers. Developers building AI-powered apps are using the server to easily connect MCP-based tools with anything that uses standard RESTful OpenAPI interfaces.</p>
<aside class="p-4 p-md-6 post-aside--large"><p class="h5-mktg gh-aside-title">Why it matters</p><p>“This new project from OpenWebUI (an alumni of 2024 GitHub Accelerator) is a great example of a growing trend in AI around integration—especially its use of MCP,” explains Abigail. “It’s highlighting that people in AI need more integration, and more standards like MCP will help.”</p>
</aside>
<h2 id="2-unbody-the-supabase-of-ai-%f0%9f%a7%a9" id="2-unbody-the-supabase-of-ai-%f0%9f%a7%a9" >2. Unbody: the “Supabase of AI” 🧩<a href="#2-unbody-the-supabase-of-ai-%f0%9f%a7%a9" class="heading-link pl-2 text-italic text-bold" aria-label="2. Unbody: the “Supabase of AI” 🧩"></a></h2>
<div class="project-bar mt-5 mt-md-7 mb-5 mb-md-7"> <br />
<a href="https://github.blog/wp-content/uploads/2025/04/chatbot-mobile-website-smartwatch.png"><img data-recalc-dims="1" fetchpriority="high" decoding="async" src="https://github.blog/wp-content/uploads/2025/04/chatbot-mobile-website-smartwatch.png?resize=761%2C488" alt="A diagram showing chatbot, mobile, website, and smartwatch feeding into API, which then feeds into data." width="761" height="488" class="alignnone size-full wp-image-87068 width-fit" srcset="https://github.blog/wp-content/uploads/2025/04/chatbot-mobile-website-smartwatch.png?w=761 761w, https://github.blog/wp-content/uploads/2025/04/chatbot-mobile-website-smartwatch.png?w=300 300w" sizes="(max-width: 761px) 100vw, 761px" /></a><br />
<div class="d-flex flex-row flex-wrap flex-items-center">
<div class="d-flex flex-row flex-items-center width-full width-md-auto mt-2">
<a class="btn-mktg btn-small-mktg d-block mr-2 width-full width-md-auto" href="https://unbody.io/" target="_blank">
<svg class="octicon d-inline-block mr-1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" width="24" height="24"><path d="M9.5 15.584V8.416a.5.5 0 01.77-.42l5.576 3.583a.5.5 0 010 .842l-5.576 3.584a.5.5 0 01-.77-.42z"></path><path fill-rule="evenodd" d="M12 2.5a9.5 9.5 0 100 19 9.5 9.5 0 000-19zM1 12C1 5.925 5.925 1 12 1s11 4.925 11 11-4.925 11-11 11S1 18.075 1 12z"></path></svg>
Website </a>
<a class="btn-mktg btn-small-mktg btn-muted-mktg d-block mr-md-2 width-full width-md-auto" href="https://github.com/unbody-io/unbody" target="_blank">
<svg class="octicon d-inline-block mr-1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16"><path fill-rule="evenodd" d="M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0016 8c0-4.42-3.58-8-8-8z"></path></svg>
Source </a>
</div>
<div class="d-flex flex-row flex-items-center mt-2" style="flex-grow: 1;">
<div class="d-flex flex-row flex-wrap text-semibold f5-mktg">
<span class="d-flex flex-row flex-items-center mr-1 my-1">
<svg class="d-block" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16" fill="#3178c6"><path fill-rule="evenodd" d="M8 4a4 4 0 100 8 4 4 0 000-8z"></path></svg>
TypeScript </span>
</div>
<div class="d-flex flex-row flex-wrap ml-auto f5-mktg">
<div class="my-1 ml-2">
<a href="https://github.com/unbody-io" target="_blank" arial-label="@unbody-io" title="@unbody-io">
<img decoding="async" class="d-block circle height-auto" width="24" height="24" src="https://avatars.githubusercontent.com/u/100227567?v=4&s=64" alt="@unbody-io">
</a>
</div>
</div>
</div>
</div>
</div>
<p><strong>📜 Apache 2.0 license</strong></p>
<p>Think <a href="https://supabase.com/">Supabase</a>, but for AI: that’s Unbody in a nutshell. It’s a modular backend that lets you build AI-native software that actually <em>understands</em> and <em>reasons</em> about knowledge, instead of just shuffling data around.</p>
<p>The project breaks things down into four layers that you can mix and match:</p>
<ol>
<li><strong>Perception</strong>: Ingests, parses, enhances, and vectorizes raw data. </li>
<li><strong>Memory</strong>: Stores structured knowledge in vector databases and persistent storage. </li>
<li><strong>Reasoning</strong>: Generates content, calls functions, and plans actions. </li>
<li><strong>Action</strong>: Exposes knowledge via APIs, SDKs, and triggers.</li>
</ol>
<aside class="p-4 p-md-6 post-aside--large"><p class="h5-mktg gh-aside-title">Why it matters</p><p>“It’s an interesting question: How does agent coding become more abstracted from the backend?” asks Kevin. “With Unbody, you can write in any framework and get a backend that’s automatically managed. If you look at companies like E2E, you see work being done to create much more advanced agents that show how the backend stack is being abstracted.”</p>
</aside>
<h2 id="3-owl-multi-agent-collaboration-in-action-%f0%9f%a6%89" id="3-owl-multi-agent-collaboration-in-action-%f0%9f%a6%89" >3. OWL: multi-agent collaboration in action 🦉<a href="#3-owl-multi-agent-collaboration-in-action-%f0%9f%a6%89" class="heading-link pl-2 text-italic text-bold" aria-label="3. OWL: multi-agent collaboration in action 🦉"></a></h2>
<div class="project-bar mt-5 mt-md-7 mb-5 mb-md-7"> <br />
<a href="https://github.blog/wp-content/uploads/2025/04/owl_architecture.png"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://github.blog/wp-content/uploads/2025/04/owl_architecture.png?resize=1024%2C576" alt="An image showing an OWL System Architecture." width="1024" height="576" class="alignnone size-full wp-image-87081 width-fit" srcset="https://github.blog/wp-content/uploads/2025/04/owl_architecture.png?w=3200 3200w, https://github.blog/wp-content/uploads/2025/04/owl_architecture.png?w=300 300w, https://github.blog/wp-content/uploads/2025/04/owl_architecture.png?w=768 768w, https://github.blog/wp-content/uploads/2025/04/owl_architecture.png?w=1024 1024w, https://github.blog/wp-content/uploads/2025/04/owl_architecture.png?w=1536 1536w, https://github.blog/wp-content/uploads/2025/04/owl_architecture.png?w=2048 2048w, https://github.blog/wp-content/uploads/2025/04/owl_architecture.png?w=3000 3000w" sizes="auto, (max-width: 1000px) 100vw, 1000px" /></a><br />
<div class="d-flex flex-row flex-wrap flex-items-center">
<div class="d-flex flex-row flex-items-center width-full width-md-auto mt-2">
<a class="btn-mktg btn-small-mktg btn-muted-mktg d-block mr-md-2 width-full width-md-auto" href="https://github.com/camel-ai/owl" target="_blank">
<svg class="octicon d-inline-block mr-1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16"><path fill-rule="evenodd" d="M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0016 8c0-4.42-3.58-8-8-8z"></path></svg>
Source </a>
</div>
<div class="d-flex flex-row flex-items-center mt-2" style="flex-grow: 1;">
<div class="d-flex flex-row flex-wrap text-semibold f5-mktg">
<span class="d-flex flex-row flex-items-center mr-1 my-1">
<svg class="d-block" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16" fill="#3572A5"><path fill-rule="evenodd" d="M8 4a4 4 0 100 8 4 4 0 000-8z"></path></svg>
Python </span>
</div>
<div class="d-flex flex-row flex-wrap ml-auto f5-mktg">
<div class="my-1 ml-2">
<a href="https://github.com/camel-ai" target="_blank" arial-label="@camel-ai" title="@camel-ai">
<img loading="lazy" decoding="async" class="d-block circle height-auto" width="24" height="24" src="https://avatars.githubusercontent.com/u/134388954?v=4&s=64" alt="@camel-ai">
</a>
</div>
</div>
</div>
</div>
</div>
<p><strong>📜 Apache 2.0 license</strong></p>
<p>When one AI agent isn’t enough, OWL enters the chat. Built on the <a href="https://www.camel-ai.org">CAMEL-AI</a> framework—best known for popularizing multi-agent role-play and releasing a trove of synthetic “task + data” bundles—<strong>OWL lets several specialized agents cooperate through browsers, terminals, function calls, and MCP tools</strong>. It even tops the open-source leaderboard on the GAIA benchmark (58.18).</p>
<aside class="p-4 p-md-6 post-aside--large"><p class="h5-mktg gh-aside-title">Why it matters</p><p>“It’s not just agentic, it’s also multi-model—and multi-agent architectures like OWL,” notes Abigail. “A year ago, it was all about people building models—now it’s all about agents and what they can do. OWL is doing multi-agent work which is quickly emerging.”</p>
</aside>
<h2 id="4-f-mcptools-command-line-power-for-mcp-developers-%f0%9f%92%bb" id="4-f-mcptools-command-line-power-for-mcp-developers-%f0%9f%92%bb" >4. F/mcptools: Command-line power for MCP developers 💻<a href="#4-f-mcptools-command-line-power-for-mcp-developers-%f0%9f%92%bb" class="heading-link pl-2 text-italic text-bold" aria-label="4. F/mcptools: Command-line power for MCP developers 💻"></a></h2>
<div class="project-bar mt-5 mt-md-7 mb-5 mb-md-7"> <br />
<a href="https://github.blog/wp-content/uploads/2025/04/mcp-tools.png"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://github.blog/wp-content/uploads/2025/04/mcp-tools.png?resize=1024%2C659" alt="A screenshot showing MCP Tools Shell." width="1024" height="659" class="alignnone size-full wp-image-87082 width-fit" srcset="https://github.blog/wp-content/uploads/2025/04/mcp-tools.png?w=1944 1944w, https://github.blog/wp-content/uploads/2025/04/mcp-tools.png?w=300 300w, https://github.blog/wp-content/uploads/2025/04/mcp-tools.png?w=768 768w, https://github.blog/wp-content/uploads/2025/04/mcp-tools.png?w=1024 1024w, https://github.blog/wp-content/uploads/2025/04/mcp-tools.png?w=1536 1536w" sizes="auto, (max-width: 1000px) 100vw, 1000px" /></a><br />
<div class="d-flex flex-row flex-wrap flex-items-center">
<div class="d-flex flex-row flex-items-center width-full width-md-auto mt-2">
<a class="btn-mktg btn-small-mktg btn-muted-mktg d-block mr-md-2 width-full width-md-auto" href="https://github.com/f/mcptools" target="_blank">
<svg class="octicon d-inline-block mr-1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16"><path fill-rule="evenodd" d="M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0016 8c0-4.42-3.58-8-8-8z"></path></svg>
Source </a>
</div>
<div class="d-flex flex-row flex-items-center mt-2" style="flex-grow: 1;">
<div class="d-flex flex-row flex-wrap text-semibold f5-mktg">
<span class="d-flex flex-row flex-items-center mr-1 my-1">
<svg class="d-block" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16" fill="#00ADD8"><path fill-rule="evenodd" d="M8 4a4 4 0 100 8 4 4 0 000-8z"></path></svg>
Go </span>
</div>
<div class="d-flex flex-row flex-wrap ml-auto f5-mktg">
<div class="my-1 ml-2">
<a href="https://github.com/f" target="_blank" arial-label="@f" title="@f">
<img loading="lazy" decoding="async" class="d-block circle height-auto" width="24" height="24" src="https://avatars.githubusercontent.com/u/196477?v=4&s=64" alt="@f">
</a>
</div>
</div>
</div>
</div>
</div>
<p><strong>📜 MIT license</strong></p>
<p>CLI fans: Here’s a command-line interface for working with MCP servers that’ll make you feel right at home. Built by <a href="https://stars.github.com/profiles/f/">GitHub Star Fatih Kadir Akin</a> (who also shipped the <a href="https://docs.github.com/en/copilot/using-github-copilot/copilot-chat/prompt-engineering-for-copilot-chat">GitHub Copilot prompts feature</a>), it lets you discover and call tools, access resources, and manage prompts from any MCP-compatible server.</p>
<p>MCP Tools supports input/output over stdin/stdout or HTTP, and spits out results in JSON or table views. It even lets you create mock servers for testing or proxy MCP requests to shell scripts.</p>
<aside class="p-4 p-md-6 post-aside--large"><p class="h5-mktg gh-aside-title">Why it matters</p><p>Why it matters: This turns MCP into something you can “git clone && mcp call.” It offers a familiar CLI workflow plus a built-in guard mode that lets you prototype tools fast <em>and</em> lock them down for prod.</p>
</aside>
<h2 id="5-nutlope-self-so-build-your-personal-site-with-ai-in-seconds-%e2%9a%a1" id="5-nutlope-self-so-build-your-personal-site-with-ai-in-seconds-%e2%9a%a1" >5. Nutlope/self.so: Build your personal site with AI in seconds ⚡<a href="#5-nutlope-self-so-build-your-personal-site-with-ai-in-seconds-%e2%9a%a1" class="heading-link pl-2 text-italic text-bold" aria-label="5. Nutlope/self.so: Build your personal site with AI in seconds ⚡"></a></h2>
<div class="project-bar mt-5 mt-md-7 mb-5 mb-md-7"> <br />
<a href="https://github.blog/wp-content/uploads/2025/04/LinkedIn-to-Website.png"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://github.blog/wp-content/uploads/2025/04/LinkedIn-to-Website.png?resize=1024%2C538" alt="A screenshot of Self.so turning a LinkedIn profile into a website." width="1024" height="538" class="alignnone size-full wp-image-87069 width-fit" srcset="https://github.blog/wp-content/uploads/2025/04/LinkedIn-to-Website.png?w=1200 1200w, https://github.blog/wp-content/uploads/2025/04/LinkedIn-to-Website.png?w=300 300w, https://github.blog/wp-content/uploads/2025/04/LinkedIn-to-Website.png?w=768 768w, https://github.blog/wp-content/uploads/2025/04/LinkedIn-to-Website.png?w=1024 1024w" sizes="auto, (max-width: 1000px) 100vw, 1000px" /></a><br />
<div class="d-flex flex-row flex-wrap flex-items-center">
<div class="d-flex flex-row flex-items-center width-full width-md-auto mt-2">
<a class="btn-mktg btn-small-mktg d-block mr-2 width-full width-md-auto" href="https://www.self.so/EB" target="_blank">
<svg class="octicon d-inline-block mr-1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" width="24" height="24"><path d="M9.5 15.584V8.416a.5.5 0 01.77-.42l5.576 3.583a.5.5 0 010 .842l-5.576 3.584a.5.5 0 01-.77-.42z"></path><path fill-rule="evenodd" d="M12 2.5a9.5 9.5 0 100 19 9.5 9.5 0 000-19zM1 12C1 5.925 5.925 1 12 1s11 4.925 11 11-4.925 11-11 11S1 18.075 1 12z"></path></svg>
Website </a>
<a class="btn-mktg btn-small-mktg btn-muted-mktg d-block mr-md-2 width-full width-md-auto" href="https://github.com/Nutlope/self.so" target="_blank">
<svg class="octicon d-inline-block mr-1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16"><path fill-rule="evenodd" d="M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0016 8c0-4.42-3.58-8-8-8z"></path></svg>
Source </a>
</div>
<div class="d-flex flex-row flex-items-center mt-2" style="flex-grow: 1;">
<div class="d-flex flex-row flex-wrap text-semibold f5-mktg">
<span class="d-flex flex-row flex-items-center mr-1 my-1">
<svg class="d-block" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16" fill="#3178c6"><path fill-rule="evenodd" d="M8 4a4 4 0 100 8 4 4 0 000-8z"></path></svg>
TypeScript </span>
</div>
<div class="d-flex flex-row flex-wrap ml-auto f5-mktg">
<div class="my-1 ml-2">
<a href="https://github.com/Nutlope" target="_blank" arial-label="@Nutlope" title="@Nutlope">
<img loading="lazy" decoding="async" class="d-block circle height-auto" width="24" height="24" src="https://avatars.githubusercontent.com/u/63742054?v=4&s=64" alt="@Nutlope">
</a>
</div>
</div>
</div>
</div>
</div>
<p><strong>📜 MIT license</strong></p>
<p>If putting together a personal website isn’t your idea of fun, Nutlope/self.so can lend a hand. Upload your résumé or LinkedIn profile, and the tool will pull together a straightforward site for you, using AI to handle the layout so you can skip the CSS headaches.</p>
<p>The tech stack includes Together.ai for language modeling, Vercel’s AI SDK, Clerk for auth, Next.js for the framework, Helicone for observability, S3 for storage, Upstash Redis for the database, and Vercel for hosting.</p>
<aside class="p-4 p-md-6 post-aside--large"><p class="h5-mktg gh-aside-title">Why it matters</p><p>The project speaks to composable “AI Lego” stacks. That’s because it chains Vercel AI SDK, Clerk auth, Upstash Redis, S3, and Tailwind—with each service doing one job well—which illustrates how modern AI apps are often stitched together from small, specialized services instead of monoliths.</p>
</aside>
<h2 id="6-voicestar-precise-control-for-text-to-speech-applications-%f0%9f%8e%99%ef%b8%8f" id="6-voicestar-precise-control-for-text-to-speech-applications-%f0%9f%8e%99%ef%b8%8f" >6. VoiceStar: precise control for text-to-speech applications 🎙️<a href="#6-voicestar-precise-control-for-text-to-speech-applications-%f0%9f%8e%99%ef%b8%8f" class="heading-link pl-2 text-italic text-bold" aria-label="6. VoiceStar: precise control for text-to-speech applications 🎙️"></a></h2>
<div class="project-bar mt-5 mt-md-7 mb-5 mb-md-7"> <br />
<div class="d-flex flex-row flex-wrap flex-items-center">
<div class="d-flex flex-row flex-items-center width-full width-md-auto mt-2">
<a class="btn-mktg btn-small-mktg btn-muted-mktg d-block mr-md-2 width-full width-md-auto" href="https://github.com/jasonppy/VoiceStar" target="_blank">
<svg class="octicon d-inline-block mr-1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16"><path fill-rule="evenodd" d="M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0016 8c0-4.42-3.58-8-8-8z"></path></svg>
Source </a>
</div>
<div class="d-flex flex-row flex-items-center mt-2" style="flex-grow: 1;">
<div class="d-flex flex-row flex-wrap text-semibold f5-mktg">
<span class="d-flex flex-row flex-items-center mr-1 my-1">
<svg class="d-block" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16" fill="#3572A5"><path fill-rule="evenodd" d="M8 4a4 4 0 100 8 4 4 0 000-8z"></path></svg>
Python </span>
</div>
<div class="d-flex flex-row flex-wrap ml-auto f5-mktg">
<div class="my-1 ml-2">
<a href="https://github.com/jasonppy" target="_blank" arial-label="@jasonppy" title="@jasonppy">
<img loading="lazy" decoding="async" class="d-block circle height-auto" width="24" height="24" src="https://avatars.githubusercontent.com/u/47729801?v=4&s=64" alt="@jasonppy">
</a>
</div>
</div>
</div>
</div>
</div>
<p><strong>📜 MIT code license, CC-BY-4.0 model license</strong></p>
<p>If your project needs speech that lands within a specific time window, VoiceStar’s duration-controllable synthesis can help. It lets developers set target lengths so voice output fits time-sensitive use cases—like fixed-length prompts or narration—without extra audio editing.</p>
<p>The project includes both CLI and Gradio interfaces for inference, plus pre-trained models you can use right away. As voice interfaces become more important in apps, having open source models with this level of control is a helpful step forward.</p>
<aside class="p-4 p-md-6 post-aside--large"><p class="h5-mktg gh-aside-title">Why it matters</p><p>It’s an open TTS model that lets you pin speech to an exact duration—which is helpful for dubbing, ads, and accessibility overlays where every millisecond counts. This is AI-assisted broadcast-grade timing from the open source community.</p>
</aside>
<h2 id="7-create-your-digital-twin-with-second-me-%f0%9f%a4%96" id="7-create-your-digital-twin-with-second-me-%f0%9f%a4%96" >7. Create your digital twin with Second-Me 🤖<a href="#7-create-your-digital-twin-with-second-me-%f0%9f%a4%96" class="heading-link pl-2 text-italic text-bold" aria-label="7. Create your digital twin with Second-Me 🤖"></a></h2>
<div class="project-bar mt-5 mt-md-7 mb-5 mb-md-7"> <br />
<a href="https://github.blog/wp-content/uploads/2025/04/secondme.png"><img data-recalc-dims="1" loading="lazy" decoding="async" src="https://github.blog/wp-content/uploads/2025/04/secondme.png?resize=1024%2C500" alt="A screenshot of Second Me." width="1024" height="500" class="alignnone size-full wp-image-87084 width-fit" srcset="https://github.blog/wp-content/uploads/2025/04/secondme.png?w=2940 2940w, https://github.blog/wp-content/uploads/2025/04/secondme.png?w=300 300w, https://github.blog/wp-content/uploads/2025/04/secondme.png?w=768 768w, https://github.blog/wp-content/uploads/2025/04/secondme.png?w=1024 1024w, https://github.blog/wp-content/uploads/2025/04/secondme.png?w=1536 1536w, https://github.blog/wp-content/uploads/2025/04/secondme.png?w=2048 2048w" sizes="auto, (max-width: 1000px) 100vw, 1000px" /></a><br />
<div class="d-flex flex-row flex-wrap flex-items-center">
<div class="d-flex flex-row flex-items-center width-full width-md-auto mt-2">
<a class="btn-mktg btn-small-mktg d-block mr-2 width-full width-md-auto" href="https://www.secondme.io/" target="_blank">
<svg class="octicon d-inline-block mr-1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" width="24" height="24"><path d="M9.5 15.584V8.416a.5.5 0 01.77-.42l5.576 3.583a.5.5 0 010 .842l-5.576 3.584a.5.5 0 01-.77-.42z"></path><path fill-rule="evenodd" d="M12 2.5a9.5 9.5 0 100 19 9.5 9.5 0 000-19zM1 12C1 5.925 5.925 1 12 1s11 4.925 11 11-4.925 11-11 11S1 18.075 1 12z"></path></svg>
Website </a>
<a class="btn-mktg btn-small-mktg btn-muted-mktg d-block mr-md-2 width-full width-md-auto" href="https://github.com/mindverse/Second-Me" target="_blank">
<svg class="octicon d-inline-block mr-1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16"><path fill-rule="evenodd" d="M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0016 8c0-4.42-3.58-8-8-8z"></path></svg>
Source </a>
</div>
<div class="d-flex flex-row flex-items-center mt-2" style="flex-grow: 1;">
<div class="d-flex flex-row flex-wrap text-semibold f5-mktg">
<span class="d-flex flex-row flex-items-center mr-1 my-1">
<svg class="d-block" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16" fill="#3572A5"><path fill-rule="evenodd" d="M8 4a4 4 0 100 8 4 4 0 000-8z"></path></svg>
Python </span>
</div>
<div class="d-flex flex-row flex-wrap ml-auto f5-mktg">
<div class="my-1 ml-2">
<a href="https://github.com/mindverse" target="_blank" arial-label="@mindverse" title="@mindverse">
<img loading="lazy" decoding="async" class="d-block circle height-auto" width="24" height="24" src="https://avatars.githubusercontent.com/u/98323155?v=4&s=64" alt="@mindverse">
</a>
</div>
</div>
</div>
</div>
</div>
<p><strong>📜 Apache 2.0 license</strong></p>
<p>Interested in experimenting with an AI stand-in? Second-Me lets you try a basic “digital twin”—an agent that aims to reflect some of your knowledge, communication style, and preferences.</p>
<p>The possibilities range from personal assistants that actually understand how you think to innovative ways to share your expertise with others. One example of Second-Me in action: Have <a href="https://secondme.gitbook.io/secondme/getting-started#second-x-apps">your digital twin manage your LinkedIn or Airbnb account</a>, playing the role of the professional or host.</p>
<aside class="p-4 p-md-6 post-aside--large"><p class="h5-mktg gh-aside-title">Why it matters</p><p>This is a prime example of the shift we’re seeing from models to agents. “If we looked a year ago, it was all about model creation. ‘How do you do X or Y?’ You’d make a new model,” explains Jeff. “This project and others on the show a shift towards agentic motions and how people are using AI to do things.”</p>
</aside>
<h2 id="8-sesameailabs-csm-reimagining-speech-synthesis-%f0%9f%94%8a" id="8-sesameailabs-csm-reimagining-speech-synthesis-%f0%9f%94%8a" >8. SesameAILabs/csm: reimagining speech synthesis 🔊<a href="#8-sesameailabs-csm-reimagining-speech-synthesis-%f0%9f%94%8a" class="heading-link pl-2 text-italic text-bold" aria-label="8. SesameAILabs/csm: reimagining speech synthesis 🔊"></a></h2>
<div class="project-bar mt-5 mt-md-7 mb-5 mb-md-7"> <br />
<div class="d-flex flex-row flex-wrap flex-items-center">
<div class="d-flex flex-row flex-items-center width-full width-md-auto mt-2">
<a class="btn-mktg btn-small-mktg btn-muted-mktg d-block mr-md-2 width-full width-md-auto" href="https://github.com/SesameAILabs/csm" target="_blank">
<svg class="octicon d-inline-block mr-1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16"><path fill-rule="evenodd" d="M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0016 8c0-4.42-3.58-8-8-8z"></path></svg>
Source </a>
</div>
<div class="d-flex flex-row flex-items-center mt-2" style="flex-grow: 1;">
<div class="d-flex flex-row flex-wrap text-semibold f5-mktg">
<span class="d-flex flex-row flex-items-center mr-1 my-1">
<svg class="d-block" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16" fill="#3572A5"><path fill-rule="evenodd" d="M8 4a4 4 0 100 8 4 4 0 000-8z"></path></svg>
Python </span>
</div>
<div class="d-flex flex-row flex-wrap ml-auto f5-mktg">
<div class="my-1 ml-2">
<a href="https://github.com/SesameAILabs" target="_blank" arial-label="@SesameAILabs" title="@SesameAILabs">
<img loading="lazy" decoding="async" class="d-block circle height-auto" width="24" height="24" src="https://avatars.githubusercontent.com/u/136667101?v=4&s=64" alt="@SesameAILabs">
</a>
</div>
</div>
</div>
</div>
</div>
<p><strong>📜 Apache 2.0 code license (model has <a href="https://github.com/SesameAILabs/csm?tab=readme-ov-file#misuse-and-abuse-%EF%B8%8F">restrictions on misuse and abuse</a>)</strong></p>
<p>The Conversational Speech Model (CSM) brings a fresh approach to speech generation. It converts text and audio inputs into Residual Vector Quantization (RVQ) audio codes using a Llama-based architecture. Its dedicated audio decoder produces Mimi audio codes that result in surprisingly natural-sounding speech.</p>
<p>What’s interesting here is how CSM merges language model architecture with specialized audio decoding—giving you an open alternative to the proprietary text-to-speech options that dominate the market.</p>
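<p>If you want to hear it for yourself, the repository’s README walks through a minimal generation call. The sketch below follows that README at the time of writing; treat the entry points (<code>load_csm_1b</code>, <code>generate</code>) as subject to change and check the repo for the current API.</p>
<pre><code class="language-python"># Minimal generation sketch based on the SesameAILabs/csm README at the time
# of writing; verify entry points against the current repo before relying on them.
import torchaudio
from generator import load_csm_1b  # module shipped in the csm repository

generator = load_csm_1b(device="cuda")

# Generate one utterance with no conversational context.
audio = generator.generate(
    text="Hello from Sesame.",
    speaker=0,
    context=[],
    max_audio_length_ms=10_000,
)

# The result is a mono waveform tensor at the generator's sample rate.
torchaudio.save("audio.wav", audio.unsqueeze(0).cpu(), generator.sample_rate)
</code></pre>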
<aside class="p-4 p-md-6 post-aside--large"><p class="h5-mktg gh-aside-title">Why it matters</p><p>CSM fuses a Llama-based text backbone with a lightweight audio decoder that outputs Mimi RVQ codes—proving multimodal mash-ups can run locally on a single GPU. This tells us model-level multimodality is starting to take on API chains while permissive Apache-2.0 licensing is accelerating community R&D on billion-parameter speech systems.</p>
</aside>
<h2 id="9-letta-a-universal-standard-for-portable-ai-agents-%f0%9f%93%a6" id="9-letta-a-universal-standard-for-portable-ai-agents-%f0%9f%93%a6" >9. Letta: a universal standard for portable AI agents 📦<a href="#9-letta-a-universal-standard-for-portable-ai-agents-%f0%9f%93%a6" class="heading-link pl-2 text-italic text-bold" aria-label="9. Letta: a universal standard for portable AI agents 📦"></a></h2>
<div class="project-bar mt-5 mt-md-7 mb-5 mb-md-7"> <br />
<div class="d-flex flex-row flex-wrap flex-items-center">
<div class="d-flex flex-row flex-items-center width-full width-md-auto mt-2">
<a class="btn-mktg btn-small-mktg btn-muted-mktg d-block mr-md-2 width-full width-md-auto" href="https://github.com/letta-ai/agent-file" target="_blank">
<svg class="octicon d-inline-block mr-1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16"><path fill-rule="evenodd" d="M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0016 8c0-4.42-3.58-8-8-8z"></path></svg>
Source </a>
</div>
<div class="d-flex flex-row flex-items-center mt-2" style="flex-grow: 1;">
<div class="d-flex flex-row flex-wrap text-semibold f5-mktg">
<span class="d-flex flex-row flex-items-center mr-1 my-1">
<svg class="d-block" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16" fill="#3572A5"><path fill-rule="evenodd" d="M8 4a4 4 0 100 8 4 4 0 000-8z"></path></svg>
Python </span>
</div>
<div class="d-flex flex-row flex-wrap ml-auto f5-mktg">
<div class="my-1 ml-2">
<a href="https://github.com/letta-ai" target="_blank" arial-label="@letta-ai" title="@letta-ai">
<img loading="lazy" decoding="async" class="d-block circle height-auto" width="24" height="24" src="https://avatars.githubusercontent.com/u/177780362?v=4&s=64" alt="@letta-ai">
</a>
</div>
</div>
</div>
</div>
</div>
<p><strong>📜 Apache 2.0 license</strong></p>
<p>Letta introduces an open file format (.af) for packaging up AI agents with their memory and behavior intact. Think of it as a portable container for agents. You can share, checkpoint, and version control them across different frameworks.</p>
<p>For developers juggling multiple agent frameworks, this could be a time-saver. Want to move an agent from one system to another without rebuilding it from scratch? That’s the problem Letta is solving.</p>
<p>Notably, the Letta project is an offshoot of <a href="https://github.com/cpacker/memgpt">the cpacker/memgpt project</a>—it spun out the serialization layer that MemGPT originally used to snapshot its “virtual-context” agents. The team carved that code into a clean, framework-agnostic spec (agent-file) so any stack—MemGPT/Letta, LangGraph, CrewAI, you name it—can import or export a fully stateful agent with a single .af archive.</p>
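<p>To make the idea concrete, here is a toy illustration of what checkpointing an agent to a portable file can look like. The field names and JSON layout below are hypothetical and are <em>not</em> the actual .af schema; see the agent-file repository for the real specification.</p>
<pre><code class="language-python"># Toy illustration only: a hypothetical "portable agent snapshot". The real
# .af schema is defined in letta-ai/agent-file; these field names are made up.
import json

agent_snapshot = {
    "name": "support-bot",
    "system_prompt": "You are a concise, friendly support agent.",
    "memory": {
        "core": ["User prefers short answers."],
        "archival": [],
    },
    "tools": [{"name": "search_docs", "description": "Search the product docs."}],
    "model": "example-model-id",  # hypothetical identifier
}

# Checkpoint the agent state to a single portable file...
with open("support-bot.agent.json", "w") as f:
    json.dump(agent_snapshot, f, indent=2)

# ...and restore it elsewhere, regardless of which framework runs it next.
with open("support-bot.agent.json") as f:
    restored = json.load(f)
print(restored["name"], "restored with", len(restored["tools"]), "tool(s)")
</code></pre>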
<aside class="p-4 p-md-6 post-aside--large"><p class="h5-mktg gh-aside-title">Why it matters</p><p>Think “Docker image for AI agents.” The .af spec snapshots memory, tools, and prompts so you can version-control, share, and hot-swap agents across frameworks (MemGPT, LangGraph, CrewAI, etc.), solving the “how do I move my agent?” headache.</p>
</aside>
<h2 id="10-blender-meets-claude-bridging-3d-creation-and-ai-%f0%9f%8e%a8" id="10-blender-meets-claude-bridging-3d-creation-and-ai-%f0%9f%8e%a8" >10. Blender meets Claude: bridging 3D creation and AI 🎨<a href="#10-blender-meets-claude-bridging-3d-creation-and-ai-%f0%9f%8e%a8" class="heading-link pl-2 text-italic text-bold" aria-label="10. Blender meets Claude: bridging 3D creation and AI 🎨"></a></h2>
<div class="project-bar mt-5 mt-md-7 mb-5 mb-md-7"> <br />
<div class="d-flex flex-row flex-wrap flex-items-center">
<div class="d-flex flex-row flex-items-center width-full width-md-auto mt-2">
<a class="btn-mktg btn-small-mktg btn-muted-mktg d-block mr-md-2 width-full width-md-auto" href="https://github.com/ahujasid/blender-mcp" target="_blank">
<svg class="octicon d-inline-block mr-1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16"><path fill-rule="evenodd" d="M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0016 8c0-4.42-3.58-8-8-8z"></path></svg>
Source </a>
</div>
<div class="d-flex flex-row flex-items-center mt-2" style="flex-grow: 1;">
<div class="d-flex flex-row flex-wrap text-semibold f5-mktg">
<span class="d-flex flex-row flex-items-center mr-1 my-1">
<svg class="d-block" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" width="16" height="16" fill="#3572A5"><path fill-rule="evenodd" d="M8 4a4 4 0 100 8 4 4 0 000-8z"></path></svg>
Python </span>
</div>
<div class="d-flex flex-row flex-wrap ml-auto f5-mktg">
<div class="my-1 ml-2">
<a href="https://github.com/ahujasid" target="_blank" arial-label="@ahujasid" title="@ahujasid">
<img loading="lazy" decoding="async" class="d-block circle height-auto" width="24" height="24" src="https://avatars.githubusercontent.com/u/11807284?v=4&s=64" alt="@ahujasid">
</a>
</div>
</div>
</div>
</div>
</div>
<p><strong>📜 MIT license</strong></p>
<p>Blender artists, this one’s for you: a third-party tool that connects the popular open source 3D creation suite Blender with Claude AI through the MCP. With Blender-MCP, developers can control Blender operations with natural language—or add AI assistance to their 3D workflow.</p>
<p>Blender-MCP shows how the <strong>MCP can act as a universal “tool port”</strong> for LLM agents: today it’s Blender; tomorrow it could be Unity, Unreal, or any complex desktop app. For 3D artists and prototypers, that means faster scene blocking, easy style experiments, and a brand-new way to teach beginners. Just describe what you want and watch the software build it.</p>
<p>Interested? Installing Blender-MCP is as simple as running a bash command.</p>
<aside class="p-4 p-md-6 post-aside--large"><p class="h5-mktg gh-aside-title">Why it matters</p><p>This shows how MCP can wire LLMs into heavyweight desktop apps—in this case, giving Claude the keys to Blender for natural-language scene creation and asset management. Its rapid growth hints that the next UX leap for 3-D (and maybe CAD, Unity, Unreal) might be chat-driven.</p>
</aside>
<h2 id="what-these-projects-tell-us-about-ais-evolution-in-open-source" id="what-these-projects-tell-us-about-ais-evolution-in-open-source" >What these projects tell us about AI’s evolution in open source<a href="#what-these-projects-tell-us-about-ais-evolution-in-open-source" class="heading-link pl-2 text-italic text-bold" aria-label="What these projects tell us about AI’s evolution in open source"></a></h2>
<p>These patterns not only reflect the current state of AI in open source, but also hint at the challenges and opportunities that lie ahead. Here’s what GitHub experts have to say about the rapidly evolving space:</p>
<h3 id="integration-in-ai-via-mcp-is-the-new-frontier-%f0%9f%94%97" id="integration-in-ai-via-mcp-is-the-new-frontier-%f0%9f%94%97" >Integration in AI via MCP is the new frontier 🔗<a href="#integration-in-ai-via-mcp-is-the-new-frontier-%f0%9f%94%97" class="heading-link pl-2 text-italic text-bold" aria-label="Integration in AI via MCP is the new frontier 🔗"></a></h3>
<p>The prominence of MCP across multiple projects highlights the growing importance of standardized integration patterns in AI development.</p>
<p>“A big pattern that I saw is the pain point around AI and integration,” notes Abigail. “More standards like MCP will help with this.”</p>
<h3 id="multi-agent-collaboration-emerges-%f0%9f%91%a5" id="multi-agent-collaboration-emerges-%f0%9f%91%a5" >Multi-agent collaboration emerges 👥<a href="#multi-agent-collaboration-emerges-%f0%9f%91%a5" class="heading-link pl-2 text-italic text-bold" aria-label="Multi-agent collaboration emerges 👥"></a></h3>
<p>Projects like OWL point to a future where multiple specialized AI agents work together to solve complex problems.</p>
<p>“You have to think about it in the construct of person to person, agent to agent, and then having multiple agents working in tandem,” explains Kevin, highlighting the complexity and potential of this approach.</p>
<h3 id="speech-generation-is-advancing-%f0%9f%97%a3%ef%b8%8f" id="speech-generation-is-advancing-%f0%9f%97%a3%ef%b8%8f" >Speech generation is advancing 🗣️<a href="#speech-generation-is-advancing-%f0%9f%97%a3%ef%b8%8f" class="heading-link pl-2 text-italic text-bold" aria-label="Speech generation is advancing 🗣️"></a></h3>
<p>Speech tech certainly isn’t new—but large language models are reshaping both <strong>text-to-speech (TTS)</strong> and <strong>speech-to-text (STT)</strong> so dramatically that a fresh wave of possibilities is opening up.</p>
<p>The bigger story is what this means downstream, with implications across media, customer support, and product UX.</p>
<h3 id="the-evolving-landscape-of-open-source-participation-%f0%9f%8c%b1" id="the-evolving-landscape-of-open-source-participation-%f0%9f%8c%b1" >The evolving landscape of open source participation 🌱<a href="#the-evolving-landscape-of-open-source-participation-%f0%9f%8c%b1" class="heading-link pl-2 text-italic text-bold" aria-label="The evolving landscape of open source participation 🌱"></a></h3>
<p>AI has drawn a fresh wave of maintainers and contributors to open source, bringing new energy and approaches. Kara Deloss, senior program manager of developer relations at GitHub, notes: “We’re seeing a new generation, or a new type of maintainer” in the AI space. Kevin adds that “if you have a big community on day one, that’s valuable,” highlighting how the ecosystem is evolving.</p>
<p>This blend of established and emerging development practices is creating exciting opportunities for collaboration across the community.</p>
<h3 id="the-importance-of-osi-approved-licenses-%f0%9f%93%84" id="the-importance-of-osi-approved-licenses-%f0%9f%93%84" >The importance of OSI-approved licenses 📄<a href="#the-importance-of-osi-approved-licenses-%f0%9f%93%84" class="heading-link pl-2 text-italic text-bold" aria-label="The importance of OSI-approved licenses 📄"></a></h3>
<p>Every top project in our list uses OSI-approved licenses (mostly MIT and Apache 2.0), and that’s no accident. “In general, you won’t get a lot of positive sentiment from the community if you call yourself open source but aren’t using an OSI-approved license,” Jeff points out. “These licenses matter because they provide clear guarantees around usage, modification, and redistribution rights that build trust in the community.”</p>
<p>Jeff also notes an emerging challenge: “As AI-powered services and tools become more powerful, we are seeing a trend where some projects attach abuse- and fraud-related restrictions on model or service use. This may not make them completely open source under an OSI-approved license, and the projects and the community will continue to have an intense conversation about these conditions.”</p>
<p>He continues: “It’s important to understand and document any restrictions in place before using a model or service,” which is a timely reminder as the open source AI community navigates these evolving licensing questions.</p>
<h2 id="explore-and-contribute-to-tomorrows-ai-tooling-%f0%9f%9b%a0%ef%b8%8f" id="explore-and-contribute-to-tomorrows-ai-tooling-%f0%9f%9b%a0%ef%b8%8f" >Explore and contribute to tomorrow’s AI tooling 🛠️<a href="#explore-and-contribute-to-tomorrows-ai-tooling-%f0%9f%9b%a0%ef%b8%8f" class="heading-link pl-2 text-italic text-bold" aria-label="Explore and contribute to tomorrow’s AI tooling 🛠️"></a></h2>
<p>The projects we’ve highlighted here are just the tip of the iceberg. As AI keeps evolving, the open source ecosystem is where many of the most exciting standards, tools, and techniques are popping up first.</p>
<p>Here’s how to get involved:</p>
<ul>
<li>Check out these projects to see how they might fit into your workflow </li>
<li>Join in and contribute to projects that spark your interest </li>
<li>Keep an eye on MCP and other emerging standards</li>
</ul>
<div class="post-content-cta"><p><a href="https://github.com/trending">Find more cutting-edge repos on GitHub’s trending page ></a></p>
</div>
<aside class="p-4 p-md-6 post-aside--large"><p class="h5-mktg gh-aside-title">📣 Call for Speakers: Git Merge 2025</p><p>Have a story, tool, or hard-won lesson that can level-up the Git community? Git Merge 2025 is now accepting talk proposals—especially from first-time speakers, maintainers, educators, and voices from under-represented groups. Submit your idea by <strong>May 13, 2025</strong> and help shape the future of distributed version control. 👉 <a href="https://sessionize.com/git-merge-2025/">Propose your talk ></a></p>
</aside>
<p>The post <a href="https://github.blog/open-source/maintainers/from-mcp-to-multi-agents-the-top-10-open-source-ai-projects-on-github-right-now-and-why-they-matter/">From MCP to multi-agents: The top 10 open source AI projects on GitHub right now and why they matter</a> appeared first on <a href="https://github.blog">The GitHub Blog</a>.</p>
Announcing the general availability of Llama 4 MaaS on Vertex AI - Google Developers Bloghttps://developers.googleblog.com/en/llama-4-ga-maas-vertex-ai/2025-04-29T22:34:01.000ZLlama 4, Meta's advanced large language model, is now generally available as a fully managed API on Vertex AI, simplifying deployment and management. The Llama 3.3 70B managed API is also generally available, offering users greater flexibility.Introducing AutoPatchBench: A Benchmark for AI-Powered Security Fixes - Engineering at Metahttps://engineering.fb.com/?p=223912025-04-29T17:15:17.000Z<ul>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">We are introducing AutoPatchBench, a benchmark for the automated repair of vulnerabilities identified through fuzzing.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">By providing a standardized benchmark, AutoPatchBench enables researchers and practitioners to objectively evaluate and compare the effectiveness of various AI program repair systems. </span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">This initiative facilitates the development of more robust security solutions, and also encourages collaboration within the community to address the critical challenge of software vulnerability repair.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">AutoPatchBench is available now on </span><a href="https://github.com/meta-llama/PurpleLlama/tree/main/CybersecurityBenchmarks" target="_blank" rel="noopener"><span style="font-weight: 400;">GitHub.</span></a></li>
</ul>
<p><span style="font-weight: 400;">AI is increasingly being applied to solve security challenges, including repairing vulnerabilities identified through fuzzing. However, the lack of a standardized benchmark for objectively assessing AI-driven bug repair agents specific to fuzzing has impeded progress in academia and the broader community. Today, we are publicly releasing AutoPatchBench, a benchmark designed to evaluate AI program repair systems. AutoPatchBench sits within </span><a href="https://github.com/meta-llama/PurpleLlama/tree/main/CybersecurityBenchmarks" target="_blank" rel="noopener"><span style="font-weight: 400;">CyberSecEval 4</span></a><span style="font-weight: 400;">, Meta’s new benchmark suite for evaluating AI capabilities to support defensive use cases. It features 136 fuzzing-identified C/C++ vulnerabilities in real-world code repos along with verified fixes sourced from the </span><a href="https://arxiv.org/abs/2408.02153" target="_blank" rel="noopener"><span style="font-weight: 400;">ARVO dataset</span></a><span style="font-weight: 400;">. </span></p>
<p><span style="font-weight: 400;">AutoPatchBench provides a standardized evaluation framework for assessing the effectiveness of AI-assisted vulnerability repair tools. This benchmark aims to facilitate a comprehensive understanding of the capabilities and limitations of various AI-driven approaches to repairing fuzzing-found bugs. By offering a consistent set of evaluation criteria, AutoPatchBench fosters transparency and reproducibility in research, enabling both academic and industry professionals to identify best practices and areas for improvement.</span></p>
<h2><span style="font-weight: 400;">Fixing fuzzing-found vulnerabilities with AI</span></h2>
<p><span style="font-weight: 400;">Fuzzing is a cornerstone in automated testing, renowned for its effectiveness in uncovering security vulnerabilities. By bombarding a target program with vast amounts of pseudo-random input data, fuzz testing exposes critical security and reliability issues, such as memory corruption, invalid pointer dereference, integer overflow, and parsing errors. </span></p>
<p><span style="font-weight: 400;">However, resolving a fuzzing crash is often a labor intensive task, demanding intricate debugging and thorough code review to pinpoint and rectify the underlying cause. This process can be both time-consuming and resource-intensive. Unlike regular test failures, fuzzing bugs frequently reveal security vulnerabilities that pose severe threats to system integrity and user data. Given these stakes, automating the repair of fuzzing bugs with AI becomes not just advantageous but essential. AI’s ability to swiftly analyze patterns and propose solutions significantly reduces the time and effort required for repairs, making it an invaluable ally in safeguarding our digital environments.</span></p>
<p><span style="font-weight: 400;">Let’s explore the process of addressing bugs identified through fuzzing by examining a demonstrative example. Consider the following C function, which harbors a read/write buffer overflow vulnerability:</span></p>
<pre class="line-numbers"><code class="language-cpp">#include <stdio.h>
#include <string.h>
void process_input(const char *input) {
char buffer[8];
strcpy(buffer, input); // Potential buffer overflow
printf("Processed: %s\n", buffer);
}
</code></pre>
<p><span style="font-weight: 400;">In this scenario, a fuzzing harness might supply an </span><span style="font-weight: 400; color: #008000; font-family: 'courier new', courier;">input</span><span style="font-weight: 400;"> that surpasses the buffer’s capacity, leading to a crash due to buffer overflow. A typical stack trace from such a crash might appear as follows:</span></p>
<pre class="line-numbers"><code class="language-none">== Fuzzer Crash Report ==
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7af1223 in strcpy () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007ffff7af1223 in strcpy ()
#1 0x0000555555555140 in process_input (input=0x7fffffffe695 "AAAAAA...")
#2 0x0000555555555162 in main (argc=2, argv=0x7fffffffe5f8)</code></pre>
<p><span style="font-weight: 400;">Here, the </span><span style="font-weight: 400; color: #008000; font-family: 'courier new', courier;">process_input</span><span style="font-weight: 400;"> function invokes </span><span style="font-weight: 400; color: #008000; font-family: 'courier new', courier;">strcpy</span><span style="font-weight: 400;"> on a string that exceeds the eight-character buffer, causing a segmentation fault. A straightforward patch involves ensuring the copy operation remains within the buffer’s limits. This can be achieved by using a bounded copy function like </span><span style="font-weight: 400; font-family: 'courier new', courier; color: #008000;">strncpy</span><span style="font-weight: 400;"> or implementing a length check before copying:</span></p>
<pre class="line-numbers"><code class="language-cpp">void process_input(const char *input) {
char buffer[8];
strncpy(buffer, input, sizeof(buffer) - 1);
buffer[sizeof(buffer) - 1] = '\0';
printf("Processed: %s\n", buffer);
}
</code></pre>
<p><span style="font-weight: 400;">This patch ensures that the string remains within the buffer’s limits, effectively preventing out-of-bounds writes. Its correctness can be confirmed by verifying that the fuzzing input, which previously caused the crash, no longer does so. Additional checks can be conducted to ensure the patch doesn’t introduce any unintended side effects.</span></p>
<p><span style="font-weight: 400;">As illustrated, fixing a fuzzing crash involves:</span></p>
<ol>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Analyzing the crash stack trace and the target code. </span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Pinpointing the root cause. </span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Patching the vulnerable code. </span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Verifying the fix’s accuracy. </span></li>
</ol>
<p><span style="font-weight: 400;">An AI-based solution can automate these steps by utilizing an LLM’s capability to understand and generate code.</span></p>
<h2><span style="font-weight: 400;">Why we developed AutoPatchBench</span></h2>
<p><span style="font-weight: 400;">AutoPatchBench is informed by key advancements in the field of AI-driven program repair, particularly those focusing on fuzzing-found vulnerabilities. Among the notable contributions is Google’s tech report on </span><a href="https://research.google/pubs/ai-powered-patching-the-future-of-automated-vulnerability-fixes/" target="_blank" rel="noopener"><span style="font-weight: 400;">AI-powered patching</span></a><span style="font-weight: 400;">, which pioneered the use of LLMs for addressing fuzzing crashes, achieving a 15% fix rate with their proprietary dataset. Subsequently, </span><a href="https://arxiv.org/abs/2501.07531" target="_blank" rel="noopener"><span style="font-weight: 400;">Google’s study on generic program repair agents</span></a><span style="font-weight: 400;"> introduced the GITS-Eval benchmark, encompassing 178 bugs across various programming languages. </span></p>
<p><span style="font-weight: 400;">In the realm of AI software engineering agents, benchmarks like </span><a href="https://www.swebench.com/" target="_blank" rel="noopener"><span style="font-weight: 400;">SWE-Bench</span></a><span style="font-weight: 400;"> and </span><a href="https://openai.com/index/introducing-swe-bench-verified/" target="_blank" rel="noopener"><span style="font-weight: 400;">SWE-Bench Verified</span></a><span style="font-weight: 400;"> have gained widespread acceptance for evaluating generic AI SWE agents. However, these benchmarks do not specifically tackle the unique challenges posed by fuzzing-found vulnerabilities, which demand specialized approaches that utilize fuzzing-specific artifacts and address security concerns. </span></p>
<p><span style="font-weight: 400;">AutoPatchBench addresses this gap by offering a dedicated benchmark focused on a wide variety of C/C++ vulnerabilities of 11 crash types identified through fuzzing with automated verification capability. Unlike the broader focus of GITS-Eval and SWE-Bench, AutoPatchBench is specifically designed to assess the effectiveness of AI-driven tools in repairing security-critical bugs typically uncovered by fuzzing. This targeted approach enables a more precise evaluation of AI capabilities in meeting the complex requirements of fuzzing-found vulnerabilities, thereby advancing the field of AI-assisted program repair in a focused manner.</span></p>
<h2><span style="font-weight: 400;">Inside AutoPatchBench</span></h2>
<p><span style="font-weight: 400;">We’re making AutoPatchBench </span><a href="https://github.com/meta-llama/PurpleLlama/tree/main/CybersecurityBenchmarks" target="_blank" rel="noopener"><span style="font-weight: 400;">publicly available</span></a><span style="font-weight: 400;"> as part of CyberSecEval 4 to encourage community collaboration in tackling the challenge of automating fuzzing crash repairs. This benchmark is specifically designed for AI program repair agents focusing on C/C++ bugs identified through fuzzing. It includes real-world C/C++ vulnerabilities with verified fixes sourced from the </span><a href="https://arxiv.org/abs/2408.02153" target="_blank" rel="noopener"><span style="font-weight: 400;">ARVO dataset</span></a><span style="font-weight: 400;">, and incorporates additional verification of AI-generated patches through fuzzing and white-box differential testing.</span></p>
<h3><span style="font-weight: 400;">ARVO dataset</span></h3>
<p><span style="font-weight: 400;">The ARVO dataset serves as the foundation for AutoPatchBench, offering a comprehensive collection of real-world vulnerabilities that are essential for advancing AI-driven security research. Sourced from C/C++ projects identified by Google’s OSS-Fuzz, ARVO includes over 5,000 reproducible vulnerabilities across more than 250 projects. Each entry is meticulously documented with a triggering input, a canonical developer-written patch, and the capability to rebuild the project in both its vulnerable and patched states. </span></p>
<p><span style="font-weight: 400;">However, there are notable challenges when using the ARVO dataset as a benchmark for AI patch generation:</span></p>
<ol>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">While reproducibility is vital for a reliable benchmark, the ARVO dataset includes samples where crashes are not consistently reproducible. Some samples lack crash stack traces, making it exceedingly difficult to address the crash.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Although ARVO provides a ground-truth fix for each identified vulnerability, it lacks an automated mechanism to verify the correctness of a generated patch. Objective automated verification is essential for a benchmark focused on patch generation.</span></li>
</ol>
<p><span style="font-weight: 400;">AutoPatchBench addresses these challenges by creating a curated subset and by employing a comprehensive and automated verification process.</span></p>
<h3><span style="font-weight: 400;">Selection criteria</span></h3>
<p><span style="font-weight: 400;">To ensure the reliability and effectiveness of AutoPatchBench, we meticulously filtered the ARVO dataset samples based on the following criteria:</span></p>
<ul>
<li style="font-weight: 400;" aria-level="1"><b>Valid C/C++ vulnerability:</b><span style="font-weight: 400;"> The ground-truth fix shall edit one or more C/C++ source files that are not fuzzing harnesses.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Dual-container setup</b><span style="font-weight: 400;">: Each vulnerability is accompanied by two containers—one that contains vulnerable code and another for the fixed code—that build without error.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Reproducibility</b><span style="font-weight: 400;">: The crash must be consistently reproducible within the vulnerable container.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Valid stack trace</b><span style="font-weight: 400;">: A valid stack trace must be present within the vulnerable container to facilitate accurate diagnosis and repair.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Successful compilation</b><span style="font-weight: 400;">: The vulnerable code must compile successfully within its designated container, ensuring that the environment is correctly set up for testing.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Fixed code verification</b><span style="font-weight: 400;">: The fixed code must also compile successfully within its respective container, confirming that the patch does not introduce new build issues.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Crash resolution</b><span style="font-weight: 400;">: The crash must be verified as resolved within the fixed container, demonstrating the effectiveness of the patch.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Fuzzing pass</b><span style="font-weight: 400;">: The fixed code must pass a comprehensive fuzzing test without finding new crashes, ensuring that the ground-truth patch maintains the integrity and functionality of the software.</span></li>
</ul>
<p><span style="font-weight: 400;">After applying these rigorous selection criteria, we retained 136 samples for AutoPatchBench that fulfill the necessary conditions for both patch generation and verification. From this refined set, we created a down-sampled subset of 113 AutoPatchBench-Lite samples to provide a focused benchmark for testing AI patch generation tools. These subsets preserves the diversity and complexity of real-world vulnerabilities including 11 distinct crash types, offering a solid foundation for advancing AI-driven security solutions.</span></p>
<h3><span style="font-weight: 400;">Patch verification</span></h3>
<p><span style="font-weight: 400;">In the process of patch generation, the patch generator utilizes two automated methods to verify the viability of a generated patch before submitting it for evaluation. The first method involves attempting to build the patched program, which checks for syntactic correctness. The second method involves attempting to reproduce the crash by running the input that initially triggered it. If the crash no longer occurs, it suggests that the issue has been resolved. However, these steps alone are insufficient to guarantee the correctness of the patch, as a patch might not maintain the program’s intended functionality, rendering it incorrect despite resolving the crash.</span></p>
<p><span style="font-weight: 400;">To address this issue, AutoPatchBench adopts a comprehensive approach to automate the evaluation of generated patches. This involves subjecting the patched code to further fuzz testing using the original fuzzing harness that initially detected the crash. Additionally, white-box differential testing compares the runtime behavior of the patched program against the ground truth repaired program, confirming that the patch has effectively resolved the underlying bug without altering the program’s intended functionality. Since a patch can potentially be made in multiple places, we cannot assume that the LLM will patch the same function as the groundtruth patch does. Instead we find all the callstacks for each call to a patched function. Then we find the lowest common ancestor (LCA) across all pairs of stacktraces offered by the groundtruth patch and the LLM patch. We then utilize debug information to inspect arguments, return values, and local variables at the first function above the LCA, differential testing offers a detailed view of the patch’s impact on the program state. </span></p>
<p><span style="font-weight: 400;">This process evaluates whether the generated patch produces a program state identical to the ground truth program after the patched function returns. By using a diverse set of inputs obtained from fuzzing, this gives higher confidence that the bug is fixed without changing the visible behavior of the patched functions. This differential testing is implemented using a Python script that leverages LLDB APIs to dump all visible states and identify differences between the ground truth and the patched program. </span></p>
<p><span style="font-weight: 400;">However, as with all attempts to solve provably undecidable problems (in this case: program equivalence), there are some failure modes for this verification step. For example, sometimes the analysis fails with timeouts, in which case we consider the semantics to be preserved if both the ground truth and the LLM patch timed out. Programs might also behave non-deterministically, and we run each input three times to identify nondeterministic struct fields and values. Such fields will not be compared to avoid false alarms from noisy, random values. Additionally, we strip any fields that contain the substring “build” or “time” as we’ve observed false positives from build-ids (that happen to be deterministic within a program, but not across different patches). </span></p>
<p><span style="font-weight: 400;">It should also be noted that on a number of examples, the crashing PoC never actually triggered the breakpoints on the ground truth patch, making comparison of the resulting states impossible. However, our case study showed that white-box differential testing is still effective in filtering out a majority of incorrect patches despite its limitation, which will be discussed in the case study.</span></p>
<h3><span style="font-weight: 400;">AutoPatchBench and AutoPatchBench-Lite</span></h3>
<p><span style="font-weight: 400;">AutoPatchBench is a comprehensive benchmark dataset of 136 samples. It encompasses a wide range of real-world vulnerabilities, providing a robust framework for assessing the capabilities of automated patch generation systems. </span></p>
<p><span style="font-weight: 400;">Within this benchmark, we have also created a subset called AutoPatchBench-Lite that consists of 113 samples. AutoPatchBench-Lite focuses on a simpler subset of vulnerabilities where the root cause of the crash is confined to a single function. This version is designed to cater to scenarios where the complexity of the bug is relatively low, making it more accessible for tools that are in the early stages of development or for those that specialize in handling straightforward issues.</span></p>
<p><span style="font-weight: 400;">The rationale for creating AutoPatchBench-Lite stems from the observation that when root causes are distributed across multiple locations within the code, the difficulty of generating a correct patch increases significantly. Addressing such “hard” crashes requires a tool to possess advanced reasoning capabilities to analyze larger codebases and apply patches to multiple areas simultaneously. This complexity not only challenges the tool’s design but also demands a higher level of sophistication in its algorithms to ensure accurate and effective patching.</span></p>
<p><span style="font-weight: 400;">By offering both AutoPatchBench and AutoPatchBench-Lite, we provide a tiered approach to benchmarking, allowing developers to progressively test and refine their tools. This structure supports the development of more advanced solutions capable of tackling both simple and complex vulnerabilities, ultimately contributing to the enhancement of AI-assisted bug repair techniques.</span></p>
<h3><span style="font-weight: 400;">Expected use cases</span></h3>
<p><span style="font-weight: 400;">AutoPatchBench offers significant value to a diverse range of users. Developers of auto-patch tools can leverage our open-sourced patch generator to enhance their tools and assess their effectiveness using the benchmark. Software projects employing fuzzing can incorporate our open-sourced patch generator to streamline vulnerability repair. Additionally, model developers can integrate the benchmark into their development cycles to build more robust and specialized expert models for bug repair. The tooling around the patch generator provided here can also be used in reinforcement learning as a reward signal during training. This data helps train models to better understand the nuances of bug repair, enabling them to learn from past fixes and improve their ability to generate accurate patches. </span></p>
<h2><span style="font-weight: 400;">Reference implementation</span></h2>
<p><span style="font-weight: 400;">We developed a basic patch generator to establish a baseline performance using AutoPatchBench. This generator is specifically designed to address simple crashes that involve patching a single function. We have </span><a href="https://github.com/meta-llama/PurpleLlama/tree/main/CybersecurityBenchmarks" target="_blank" rel="noopener"><span style="font-weight: 400;">open-sourced this reference implementation</span></a><span style="font-weight: 400;"> to encourage the community to build and expand upon it.</span></p>
<p><span style="font-weight: 400;">Figure 1 shows a high-level overview of its design. The patch generator takes crash stack trace and target source code as input. It identifies the source locations from the stack trace and extracts every function that contains those locations. It then asks the LLM to identify the root cause and repair the crash by patching one of the functions. Upon receiving a response from the LLM, the patch generator extracts the revised code, applies the patch, compiles the program, and tests it against the original input that caused the crash. If the build or test fails, we re-engage the LLM with the error message from the build or test output, requesting it to attempt a solution again until the crash is resolved. If a fix trajectory fails to reach a valid solution that passes build and crash reproduction within a finite number of steps, we start a new trajectory to reset the context window, preventing prolonged entrapment in an incorrect path.</span></p>
<figure id="attachment_22444" aria-describedby="caption-attachment-22444" style="width: 600px" class="wp-caption alignnone"><img fetchpriority="high" decoding="async" class="wp-image-22444" src="https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-1.png?w=1024" alt="" width="600" height="519" srcset="https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-1.png 1600w, https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-1.png?resize=916,792 916w, https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-1.png?resize=768,664 768w, https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-1.png?resize=1024,886 1024w, https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-1.png?resize=1536,1329 1536w, https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-1.png?resize=96,83 96w, https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-1.png?resize=192,166 192w" sizes="(max-width: 992px) 100vw, 62vw" /><figcaption id="caption-attachment-22444" class="wp-caption-text">Figure 1: Patch generation flowchart.</figcaption></figure>
<p><span style="font-weight: 400;">The example prompt below also captures the essence of our approach. Note that the actual prompt is more detailed and divided into multiple segments to promote chain-of-thought reasoning.</span></p>
<pre class="line-numbers"><code class="language-none">As an experienced Security Engineer at Meta, your task is to address the following security-critical fuzzing crash. Below is the stack trace of the crash:
== Fuzzer Crash Report ==
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7af1223 in strcpy () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007ffff7af1223 in strcpy ()
#1 0x0000555555555140 in process_input (input=0x7fffffffe695 "AAAAAA...")
#2 0x0000555555555162 in main (argc=2, argv=0x7fffffffe5f8)
Here is the source code for the functions involved in the stack trace:
strcpy() {...}
void process_input(const char *input) {
char buffer[8];
strcpy(buffer, input); // Potential buffer overflow
printf("Processed: %s\n", buffer);
}
int main() {...}
Assuming the root cause of the crash is within one of these functions, generate a patched version of the faulty function to resolve the fuzzing crash. Ensure that you provide a complete rewrite of the function so that the patch can be applied and the code compiled without errors.
</code></pre>
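<p><span style="font-weight: 400;">For readers who prefer code to flowcharts, the retry loop in Figure 1 can be summarized as follows. This is a condensed sketch of the flow described above, not the open-sourced reference implementation; the callables passed in (ask_llm, apply_patch, build_and_reproduce) are placeholders for the corresponding steps.</span></p>
<pre class="line-numbers"><code class="language-python"># Condensed sketch of the Figure 1 retry loop. The injected callables are
# placeholders for the steps described in the text, not the reference API.
def generate_patch(stack_trace, candidate_functions, ask_llm, apply_patch,
                   build_and_reproduce, max_trajectories=10, max_steps=5):
    for _ in range(max_trajectories):
        # Each trajectory starts with a fresh context window.
        messages = [{
            "role": "user",
            "content": (
                f"Fuzzing crash:\n{stack_trace}\n\n"
                f"Functions on the stack:\n{candidate_functions}\n\n"
                "Identify the root cause and rewrite the faulty function."
            ),
        }]
        for _ in range(max_steps):
            patch = ask_llm(messages)              # full rewrite of one function
            apply_patch(patch)
            ok, error_log = build_and_reproduce()  # compile, then replay the PoC input
            if ok:
                return patch                       # builds and no longer crashes
            # Feed the build or reproduction error back and try again.
            messages.append({"role": "user", "content": error_log})
        # Trajectory exhausted without a fix: reset the context and start over.
    return None
</code></pre>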
<h2><span style="font-weight: 400;">A case study with AutoPatchBench-Lite</span></h2>
<p><span style="font-weight: 400;">In the case study, we demonstrate the use of AutoPatchBench by evaluating our reference patch generator with several LLM models. Given that our reference implementation is limited to addressing simple issues, we conducted our evaluation with AutoPatchBench-Lite, which contains 113 samples. To prevent fix trajectories from becoming excessively prolonged, we capped the maximum length of each trajectory at five. Additionally, we set the maximum number of retries to 10. </span></p>
<p><i><span style="font-weight: 400;">Please note that the case study is not intended to provide a statistically rigorous comparison of model performance. Instead, it aims to present preliminary results to establish a baseline expectation. We encourage future research to build upon these findings.</span></i></p>
<h3><span style="font-weight: 400;">Effectiveness of patch generation and verification</span></h3>
<p><span style="font-weight: 400;">We evaluated the effectiveness of the patch generator and our automated verification processes while using different LLM models as back-end. The figure below illustrates the effectiveness of patch generation and verification by presenting the percentage of samples that successfully passed each sequential verification step: (1) patch validity: build and crash reproducibility check, (2) fuzzing pass: passes 10-minute fuzzing, and (3) testing pass: passes white-box differential testing. It is important to note that the patch generation process only utilizes step (1) to verify the build and crash reproducibility. The fuzzing and differential testing are conducted post-generation to assess correctness.</span></p>
<figure id="attachment_22443" aria-describedby="caption-attachment-22443" style="width: 1024px" class="wp-caption alignnone"><img decoding="async" class="size-large wp-image-22443" src="https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-2.png?w=1024" alt="" width="1024" height="634" srcset="https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-2.png 1496w, https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-2.png?resize=916,567 916w, https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-2.png?resize=768,475 768w, https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-2.png?resize=1024,634 1024w, https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-2.png?resize=96,59 96w, https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-2.png?resize=192,119 192w" sizes="(max-width: 992px) 100vw, 62vw" /><figcaption id="caption-attachment-22443" class="wp-caption-text">Figure 2: Patch generation and verification success rate.</figcaption></figure>
<p><span style="font-weight: 400;">Figure 2 shows that all models achieved similar generation success rates of around 60% and similar post-verification success rates of around 5-11% with overlapping confidence intervals, and therefore, we do not draw any conclusion about their relative performance. The graph does, however, reveal that a substantial portion of the generated patches are found to be incorrect when subjected to fuzzing and white-box differential testing. For instance, Gemini 1.5 Pro achieved a 61.1% patch generation success rate, yet fewer than 15% of these patches (5.3% out of total set) were found to be correct. This gap highlights that build and crash reproduction are not good enough signals to infer the correctness of generated patches, and that future patch generation approaches should scrutinize the semantic preservation of generated patches more thoroughly. This gap also underscores the vital role of the comprehensive verification processes that checks semantic equivalence, a distinctive contribution of AutoPatchBench.</span></p>
<h3><span style="font-weight: 400;">Effect of inference-time computation</span></h3>
<p><span style="font-weight: 400;">To assess the impact of inference-time computation on improving the patch generation success rate, we present the distribution of retry counts among the 73 patches produced by Llama 4 Maverick.</span><span style="font-weight: 400;"><br />
</span></p>
<figure id="attachment_22469" aria-describedby="caption-attachment-22469" style="width: 1024px" class="wp-caption alignnone"><img decoding="async" class="size-large wp-image-22469" src="https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-3-1.png?w=1024" alt="" width="1024" height="548" srcset="https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-3-1.png 1386w, https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-3-1.png?resize=916,490 916w, https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-3-1.png?resize=768,411 768w, https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-3-1.png?resize=1024,548 1024w, https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-3-1.png?resize=96,51 96w, https://engineering.fb.com/wp-content/uploads/2025/04/AutoPatchBench-image-3-1.png?resize=192,103 192w" sizes="(max-width: 992px) 100vw, 62vw" /><figcaption id="caption-attachment-22469" class="wp-caption-text">Figure 3: Percentage of generated patches per number of iterations.</figcaption></figure>
<p><span style="font-weight: 400;">Figure 3 shows that 44 out of 73 patches, or 60.2%, were successfully generated on the first attempt. The remaining 40% of the samples required more than two iterations, with no evident plateau until the 10th iteration. This outcome demonstrates that allocating more computational resources during inference-time leads to a higher success rate and suggests that increasing the number of retries could yield better results.</span></p>
<h3><span style="font-weight: 400;">Manual validation</span></h3>
<p><span style="font-weight: 400;">In our investigation of the precision and recall of white-box differential testing, we conducted a manual validation of 44 patches that passed 10-minute fuzzing against human-written ground truth fixes with the help of security experts. These patches were selected from a pool of 73 generated by Llama 4 Maverick. The following table shows the confusion matrix.</span></p>
<p><span style="font-weight: 400;">Table 1: Confusion matrix between human judgement and differential testing</span></p>
<table style="height: 112px;" border="1" width="573">
<tbody>
<tr>
<td></td>
<td><span style="font-weight: 400;">Test pass</span></td>
<td><span style="font-weight: 400;">Test fail</span></td>
<td><span style="font-weight: 400;">Sum</span></td>
</tr>
<tr>
<td><span style="font-weight: 400;">Human pass</span></td>
<td><span style="font-weight: 400;">5</span></td>
<td><span style="font-weight: 400;">0</span></td>
<td><span style="font-weight: 400;">5</span></td>
</tr>
<tr>
<td><span style="font-weight: 400;">Human reject</span></td>
<td><span style="font-weight: 400;">7</span></td>
<td><span style="font-weight: 400;">32</span></td>
<td><span style="font-weight: 400;">39</span></td>
</tr>
<tr>
<td><span style="font-weight: 400;">Sum</span></td>
<td><span style="font-weight: 400;">12</span></td>
<td><span style="font-weight: 400;">32</span></td>
<td><span style="font-weight: 400;">44</span></td>
</tr>
</tbody>
</table>
<p><span style="font-weight: 400;">The results showed that the differential testing achieved an accuracy of 84.1% for this sample (5 + 32 / 44), indicating a high overall agreement with the human assessment. However, a closer examination of the confusion matrix revealed a notable discrepancy between precision and recall. Specifically, the testing method demonstrated 100.0% recall in this case study, correctly identifying all 5 instances that humans judged as correct. In contrast, precision was relatively low (41.7%), with 7 false positives out of 12 total positive predictions. This suggests that differential testing reported success on some incorrect patches as well, highlighting the need for manual validation of patch correctness. Despite this shortcoming, the result clearly shows the utility of differential testing in automatically rejecting a substantial number of incorrect patches, which will substantially save the manual validation effort.</span></p>
<h3><span style="font-weight: 400;">Key insights</span></h3>
<p><span style="font-weight: 400;">Our case study revealed several limitations of the current patch generator.</span></p>
<h4><span style="font-weight: 400;">The root cause may not exist in the stack trace</span></h4>
<p><span style="font-weight: 400;">Frequently, crashes are the result of state contamination that occurs prior to the crash being triggered. Consequently, none of the functions within the stack frames may include the code responsible for the root cause. Since our current implementation requires the LLM to assume that the root cause is located within one of the functions in the stack trace, it is unable to generate an accurate patch in such cases. Solving this problem would require a more autonomous agent which can reason about the root cause on its own with a code browsing capability.</span></p>
<h4><span style="font-weight: 400;">Cheating</span></h4>
<p><span style="font-weight: 400;">In some instances, the LLM resorted to “cheating” by producing patches that superficially resolved the issue without addressing the underlying problem. This can occur when the generator modifies or removes code in a way that prevents the crash from occurring, but does not actually fix the root cause of the issue. We observed that cheating happens more frequently when we request the LLM to retry within the same trajectory. A potential solution to this could be to empower the LLM to say “I cannot fix it,” which may come with a tradeoff with success rate. However, note that most of the cheating was caught in the verification step, highlighting the utility of differential testing.</span></p>
<h4><span style="font-weight: 400;">Need for enhanced patch verification methods</span></h4>
<p><span style="font-weight: 400;">Fuzzing and white-box differential testing have shown that a large majority of generated patches are incorrect when compared to the ground-truth patches. This finding highlights the challenge of generating accurate patches without enhanced verification capabilities. To address this gap, several approaches can be considered:</span></p>
<ul>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">A patch generator could provide additional code context when querying the LLM for a patch so that LLM can better understand the consequence of a code patch.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">A patch generator could make additional LLM queries to verify the perseverance of existing functionality.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">A patch generator can attempt to generate multiple valid patches by exploring multiple trajectories in parallel, and let LLM choose the best option that is most likely to be correct.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">In a well-tested real-world codebase, a patch generator can utilize existing tests to validate the patches it creates. This process complements building the code and checking for crash reproduction, allowing the patch generator to retry if a patch fails the tests. The accuracy of the generated patches is largely dependent on the thoroughness of the existing tests.</span></li>
</ul>
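<p>As a rough illustration of the parallel-trajectory idea above, the sketch below samples several candidate patches independently, keeps only those that pass verification, and asks the model to pick the most promising one. All callables here are assumptions supplied by the caller, not part of AutoPatchBench.</p>
<pre><code class="language-python">from concurrent.futures import ThreadPoolExecutor

def best_of_n_patches(crash_report, generate_patch, passes_verification,
                      choose_best, n=4):
    """Explore n independent trajectories, keep candidates that pass the
    build/crash-reproduction/differential checks, then let the LLM pick one.

    generate_patch, passes_verification, and choose_best are hypothetical
    callables; they are not part of AutoPatchBench."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        candidates = list(pool.map(lambda _: generate_patch(crash_report), range(n)))
    verified = [p for p in candidates if passes_verification(p)]
    if not verified:
        return None
    return choose_best(crash_report, verified)   # LLM-as-judge selection (assumed)
</code></pre>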
<p><span style="font-weight: 400;">In conclusion, while our study has identified several challenges with the current patch generation process, it also opens up opportunities for improvement. By addressing these limitations with innovative solutions, we can enhance the accuracy and reliability of patch generation, paving the way for more robust and effective automated tools.</span></p>
<h2><span style="font-weight: 400;">Get started with AutoPatchBench</span></h2>
<p><span style="font-weight: 400;">AutoPatchBench is now available on </span><a href="https://github.com/meta-llama/PurpleLlama/tree/main/CybersecurityBenchmarks"><span style="font-weight: 400;">GitHub</span></a><span style="font-weight: 400;">. We welcome pull requests to integrate new/additional agent architectures into the framework, and look forward to seeing how well they perform on AutoPatchBench.</span></p>
<p>The post <a rel="nofollow" href="https://engineering.fb.com/2025/04/29/ai-research/autopatchbench-benchmark-ai-powered-security-fixes/">Introducing AutoPatchBench: A Benchmark for AI-Powered Security Fixes</a> appeared first on <a rel="nofollow" href="https://engineering.fb.com">Engineering at Meta</a>.</p>
Building Private Processing for AI tools on WhatsApp - Engineering at Metahttps://engineering.fb.com/?p=224342025-04-29T17:15:00.000Z<ul>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">We are inspired by the possibilities of AI to help people be more creative, productive, and stay closely connected on WhatsApp, so we set out to build a new technology that allows our users around the world to use AI in a privacy-preserving way.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">We’re sharing an early look into Private Processing, an optional capability that enables users to initiate a request to a confidential and secure environment and use AI for processing messages where no one — including Meta and WhatsApp — can access them.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">To validate our implementation of these and other security principles, independent security researchers will be able to continuously verify our privacy and security architecture and its integrity.</span></li>
</ul>
<p><span style="font-weight: 400;">AI has revolutionized the way people interact with technology and information, making it possible for people to automate complex tasks and gain valuable insights from vast amounts of data. However, the current state of AI processing — which relies on large language models often running on servers, rather than mobile hardware — requires that users’ requests are visible to the provider. Although that works for many use cases, it presents challenges in enabling people to use AI to process private messages while preserving the level of privacy afforded by end-to-end encryption.</span></p>
<p><span style="font-weight: 400;">We set out to enable AI capabilities with the privacy that people have come to expect from WhatsAp</span><span style="font-weight: 400;">p, so that AI can deliver helpful capabilities, such as summarizing messages, without Meta or WhatsApp having access to them, and in the way that meets the following principles:</span></p>
<ul>
<li style="font-weight: 400;" aria-level="1"><b>Optionality:</b><span style="font-weight: 400;"> Using Meta AI through WhatsApp, including features that use Private Processing, must be optional. </span></li>
<li style="font-weight: 400;" aria-level="1"><b>Transparency: </b><span style="font-weight: 400;">We must provide transparency when our features use Private Processing.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>User control:</b><span style="font-weight: 400;"> For people’s most sensitive chats that require extra assurance, they must be able to prevent messages from being used for AI features like mentioning Meta AI in chats, with the help of WhatApp’s </span><a href="https://blog.whatsapp.com/introducing-advanced-chat-privacy" target="_blank" rel="noopener"><span style="font-weight: 400;">Advanced Chat Privacy</span></a><span style="font-weight: 400;"> feature.</span></li>
</ul>
<h2><span style="font-weight: 400;">Introducing Private Processing</span></h2>
<p><span style="font-weight: 400;">We’re excited to share an initial overview of Private Processing, a new technology we’ve built to </span><span style="font-weight: 400;">support people’s needs and aspirations to leverage AI in a secure and privacy-preserving way. This confidential computing infrastructure, built on top of a Trusted Execution Environment (TEE), will make it possible for people to direct AI to </span><span style="font-weight: 400;">process their requests — like summarizing unread WhatsApp threads or getting writing suggestions — in our secure and private cloud environment. In other words, Private Processing will allow users to leverage powerful AI features, while preserving WhatsApp’s core privacy promise, ensuring </span><b>no one except you and the people you’re talking to can access or share your personal messages, not even Meta or WhatsApp. </b></p>
<p><span style="font-weight: 400;">To uphold this level of privacy and security, we designed Private Processing with the following foundational requirements:</span></p>
<ul>
<li style="font-weight: 400;" aria-level="1"><b>Confidential processing:</b><span style="font-weight: 400;"> Private Processing must be built in such a way that prevents any other system from accessing user’s data — including Meta, WhatsApp or any third party — while in processing or in transit to Private Processing.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Enforceable guarantees:</b><span style="font-weight: 400;"> Attempts to modify that confidential processing guarantee must cause the system to fail closed or become publicly discoverable via verifiable transparency.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Verifiable transparency: </b><span style="font-weight: 400;">Users and security researchers must be able to audit the behavior of Private Processing to independently verify our privacy and security guarantees.</span></li>
</ul>
<p><span style="font-weight: 400;">However, we know that technology platforms like ours operate in a highly adversarial environment where threat actors continuously adapt, and software and hardware systems keep evolving, generating unknown risks. As part of our </span><a href="https://engineering.fb.com/2022/07/28/security/five-security-principles-for-billions-of-messages-across-metas-apps/" target="_blank" rel="noopener"><span style="font-weight: 400;">defense-in-depth</span> <span style="font-weight: 400;">ap</span><span style="font-weight: 400;">p</span><span style="font-weight: 400;">roach</span></a><span style="font-weight: 400;"> and best practices for any security-critical system, we’re treating the following additional layers of requirements as core to Private Processing on WhatsApp:</span></p>
<ul>
<li style="font-weight: 400;" aria-level="1"><b>Non-targetability:</b><span style="font-weight: 400;"> An attacker should not be able to target a particular user for compromise without attempting to compromise the entire Private Processing system.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Stateless processing and forward security:</b><span style="font-weight: 400;"> Private Processing must not retain access to user messages once the session is complete to ensure that the attacker can not gain access to historical requests or responses.</span></li>
</ul>
<h3><span style="font-weight: 400;">Threat modeling for Private Processing</span></h3>
<p><span style="font-weight: 400;">Because we set out to meet these high-security requirements, our work to build Private Processing began with developing a threat model to help us identify potential attack vectors and vulnerabilities that could compromise the confidentiality, integrity, or availability of user data. We’ve worked with our peers in the security community to audit the architecture and our implementation to help us continue to harden them. </span></p>
<h3><span style="font-weight: 400;">Building in the open</span></h3>
<p><span style="font-weight: 400;">To help inform our industry’s progress in building private AI processing, and to enable independent security research in this area, we will be publishing components of Private Processing, expanding the scope of our </span><a href="https://bugbounty.meta.com/" target="_blank" rel="noopener"><span style="font-weight: 400;">Bug Bounty program</span></a><span style="font-weight: 400;"> to include Private Processing, and releasing a detailed security engineering design paper, </span><b>as we get closer to the launch of Private Processing in the coming weeks. </b></p>
<p><span style="font-weight: 400;">While AI-enabled processing of personal messages for summarization and writing suggestions at users’ direction is the first use case where Meta applies Private Processing, we expect there will be others where the same or similar infrastructure might be beneficial in processing user requests. We will continue to share our learnings and progress transparently and responsibly.</span></p>
<h2><span style="font-weight: 400;">How Private Processing works</span></h2>
<p><span style="font-weight: 400;">Private Processing creates a secure cloud environment where AI models can analyze and process data without exposing it to unauthorized parties. </span></p>
<p><span style="font-weight: 400;">Here’s how it works:</span></p>
<ul>
<li style="font-weight: 400;" aria-level="1"><b>Authentication: </b><span style="font-weight: 400;">First, Private Processing obtains </span><a href="https://engineering.fb.com/2022/12/12/security/anonymous-credential-service-acs-open-source/" target="_blank" rel="noopener"><span style="font-weight: 400;">anonymous credentials</span></a><span style="font-weight: 400;"> to verify that the future requests are coming from authentic WhatsApp clients.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Third-party routing and load balancing:</b><span style="font-weight: 400;"> In addition to these credentials, Private Processing fetches HPKE encryption public keys from a third-party CDN in order to support </span><span style="font-weight: 400;">Oblivious HTTP</span><span style="font-weight: 400;"> (OHTTP).</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Wire session establishment: </b><span style="font-weight: 400;">Private Processing establishes an OHTTP connection from the user’s device to a Meta gateway via a third-party relay which hides requester IP from Meta and WhatsApp.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Application session establishment:</b><span style="font-weight: 400;"> Private Processing establishes a Remote Attestation + Transport Layer Security (RA-TLS) session between the user’s device and the TEE. The attestation verification step cross-checks the measurements against a third-party ledger to ensure that the client only connects to code which satisfies our verifiable transparency guarantee.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Request to Private Processing: </b><span style="font-weight: 400;">After the above session is established, the device makes a request to Private Processing (e.g., message summarization request), that is</span><span style="font-weight: 400;"> encrypted end-to-end between the device and Private Processing with an ephemeral key that Meta and WhatsApp cannot access. In other words, no one except the user’s device or the selected TEEs can decrypt the request.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Private Processing:</b><span style="font-weight: 400;"> Our AI models process data in a confidential virtual machine (CVM), a type of TEE, without storing any messages, in order to generate a response. CVMs may communicate with other CVMs using the same RA-TLS connection clients use to complete processing. </span></li>
<li style="font-weight: 400;" aria-level="1"><b>Response from Private Processing: </b><span style="font-weight: 400;">The processed results are then returned to the user’s device, encrypted with a key that only the device and the pre-selected Private Processing server ever have access to. </span><span style="font-weight: 400;">Private Processing does not retain access to messages after the session is completed.</span></li>
</ul>
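<p><span style="font-weight: 400;">The sketch below is a highly simplified, purely conceptual outline of that client-side flow. Every helper on the hypothetical </span><code>device</code><span style="font-weight: 400;"> and session objects is a placeholder; it is not WhatsApp client code, and real OHTTP, HPKE, and RA-TLS details are omitted.</span></p>
<pre><code class="language-python">def private_processing_request(device, prompt_messages):
    """Conceptual outline only; every helper here is a hypothetical placeholder."""
    # 1. Authentication: an anonymous credential proves the request comes from
    #    an authentic WhatsApp client without identifying the user.
    credential = device.fetch_anonymous_credential()

    # 2. Third-party routing: fetch HPKE public keys from a third-party CDN so
    #    the request can be sent over Oblivious HTTP (OHTTP).
    ohttp_keys = device.fetch_ohttp_keys_from_cdn()

    # 3. Wire session: connect to the Meta gateway via a third-party relay,
    #    which hides the device's IP address from Meta and WhatsApp.
    relay_session = device.open_ohttp_session(ohttp_keys)

    # 4. Application session: RA-TLS to the TEE; refuse to continue if the
    #    attested measurement is not in the third-party transparency log.
    tee_session = relay_session.establish_ra_tls()
    if not device.measurement_in_transparency_log(tee_session.attestation):
        raise RuntimeError("attestation does not match a published CVM binary")

    # 5. Request: encrypt the prompt end-to-end with an ephemeral key held only
    #    by the device and the selected TEE, then send it for processing.
    response = tee_session.send(
        tee_session.encrypt_with_ephemeral_key(prompt_messages, credential))

    # 6. Response: decrypt locally; the service retains no access afterwards.
    return tee_session.decrypt(response)
</code></pre>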
<h2><span style="font-weight: 400;">The threat model</span></h2>
<p><span style="font-weight: 400;">In designing any security-critical system, it is important to develop a threat model to guide how we build its defenses. Our threat model for Private Processing includes three key components:</span></p>
<ul>
<li style="font-weight: 400;" aria-level="1"><b>Assets</b><span style="font-weight: 400;">: The sensitive data and systems that we need to protect.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Threat actors</b><span style="font-weight: 400;">: The individuals or groups that may attempt to compromise our assets.</span></li>
<li style="font-weight: 400;" aria-level="1"><b>Threat scenarios</b><span style="font-weight: 400;">: The ways in which our assets could be compromised, including the tactics, techniques, and procedures (TTPs) that threat actors might use.</span></li>
</ul>
<h3><span style="font-weight: 400;">Assets</span></h3>
<p><span style="font-weight: 400;">In the context of applying Private Processing to </span><span style="font-weight: 400;">summarizing unread messages or providing writing suggestions at users’ direction, </span><span style="font-weight: 400;">we will use Private Processing to protect messaging content, whether they have been received by the user, or still in draft form. We use the term “messages” to refer to these primary assets in the context of this blog.</span></p>
<p><span style="font-weight: 400;">In addition to messages, we also include additional, secondary assets which help support the goal of Private Processing and may interact with or directly process assets: the Trusted Computing Base (TCB) of the Confidential Virtual Machine (CVM), the underlying hardware, and the cryptographic keys used to protect data in transit.</span></p>
<h3><span style="font-weight: 400;">Threat actors</span></h3>
<p><span style="font-weight: 400;">We have identified three threat actor types that could attack our system to attempt to recover assets.</span></p>
<ol>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Malicious or compromised insiders with access to our infrastructure.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">A third party or supply chain vendor with access to components of the infrastructure.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Malicious end users targeting other users on the platform.</span></li>
</ol>
<h3><span style="font-weight: 400;">Threat scenarios</span></h3>
<p><span style="font-weight: 400;">When building Private Processing to be resilient against these threat actors, we consider relevant threat scenarios that may be pursued against our systems, including (but not limited to) the following:</span></p>
<h4><span style="font-weight: 400;">External actors directly exploit the exposed product attack surface or compromise the services running in Private Processing CVMs to extract messages.</span></h4>
<p><span style="font-weight: 400;">Anywhere the system processes untrusted data, there is potentially an attack surface for a threat actor to exploit. Examples of these kinds of attacks include exploitation of zero-day vulnerabilities or attacks unique to AI such as prompt injection. </span></p>
<p><span style="font-weight: 400;">Private Processing is designed to reduce such an attack surface through limiting the exposed entry points to a small set of thoroughly reviewed components which are subject to regular assurance testing.</span><span style="font-weight: 400;"> The service binaries are hardened and run in a containerized environment to mitigate the risks of code execution and limit a compromised binary’s ability to exfiltrate data from within the CVM to an external party.</span></p>
<h4><span style="font-weight: 400;">Internal or external attackers extract messages exposed through the CVM.</span></h4>
<p><span style="font-weight: 400;">Observability and debuggability remains a challenge in highly secure environments as they can be at odds with the goal of confidential computing, potentially exposing side channels to identify data and in the worst case accidentally leaking messages themselves. However, deploying any service at scale requires some level of observability to identify failure modes, since they may negatively impact many users, even when the frequency is uncommon. We implement a log-filtering system to limit export to only allowed log lines, such as error logs.</span></p>
<p><span style="font-weight: 400;">Like any complex system, Private Processing is built of components to form a complex supply chain of both hardware and software. Internally, our CVM build process occurs in restricted environments that maintain provenance and require multi-party review. Transparency of the CVM environment, which we’ll provide </span><span style="font-weight: 400;">through publishing a third-party log of CVM binary digests and CVM binary images</span><span style="font-weight: 400;">, will allow external researchers to analyze, replicate, and report instances where they believe logs could leak user data.</span></p>
<h4><span style="font-weight: 400;">Insiders with physical or remote access to Private Processing hosts interfere with the CVM at boot and runtime, potentially bypassing the protections in order to extract messages.</span></h4>
<p><span style="font-weight: 400;">TEE software exploitation is a growing area of security research, and vulnerability researchers have repeatedly demonstrated the ability to bypass TEE guarantees. Similarly, physical attacks on Private Processing hosts may be used to defeat TEE guarantees or present compromised hosts as legitimate to an end user.</span></p>
<p><span style="font-weight: 400;">To address these unknown risks, we built Private Processing on the principle of defense-in-depth by actively tracking novel vulnerabilities in this space, minimizing and sanitizing untrusted inputs to the TEE, minimizing attack surface through CVM hardening and enabling abuse detection through enhanced host monitoring.</span></p>
<p><span style="font-weight: 400;">Because we know that defending against physical access introduces significant complexity and attack surface even with industry-leading controls, we continuously pursue further attack surface hardening. In addition, we reduce these risks through measures like encrypted DRAM and standard physical security controls to protect our datacenters from bad actors.</span></p>
<p><span style="font-weight: 400;">To further address these unknown risks, we seek to eliminate the viability of targeted attacks via routing sessions through a third-party OHTTP relay to prevent an attacker’s ability to route a specific user to a specific machine.</span></p>
<h2><span style="font-weight: 400;">Designing Private Processing</span></h2>
<p><span style="font-weight: 400;">Here is how we designed Private Processing </span><span style="font-weight: 400;">to meet these foundational security and privacy requirements against the threat model we developed. </span></p>
<p><i><span style="font-weight: 400;">(Further technical documentation and security research engagements updates are coming soon).</span></i></p>
<h3><span style="font-weight: 400;">Confidential processing</span></h3>
<p><span style="font-weight: 400;">Data shared to Private Processing is processed in an environment which does not make it available to any other system. This protection is further upheld by encrypting data end-to-end between the client and the Private Processing application, so that only Private Processing, and no one in between – including Meta, WhatsApp, or any third-party relay – can access the data.</span></p>
<p><span style="font-weight: 400;">To prevent possible user data leakage, only limited service reliability logs are permitted to leave the boundaries of CVM.</span></p>
<h3><span style="font-weight: 400;">System software</span></h3>
<p><span style="font-weight: 400;">To prevent privileged runtime access to Private Processing, we prohibit remote shell access, including from the host machine, and implement security measures including code isolation. Code isolation ensures that only designated code in Private Processing has access to user data. Prohibited remote shell access ensures that neither the host nor a networked user can gain access to the CVM shell.</span></p>
<p><span style="font-weight: 400;">We defend against potential source control and supply chain attacks by implementing established industry best practices. This includes building software exclusively from checked-in source code and artifacts, where any change requires multiple engineers to modify the build artifacts or build pipeline.</span></p>
<p><span style="font-weight: 400;">As another layer of security, all code changes are auditable. This allows us to ensure that any potential issues are discovered — either through our continuous internal audits of code, or by external security researchers auditing our binaries.</span></p>
<h3><span style="font-weight: 400;">System hardware</span></h3>
<p><span style="font-weight: 400;">Private Processing utilizes CPU-based confidential virtualization technologies, along with Confidential Compute mode GPUs, which prevent certain classes of attacks from the host operating system, as well as certain physical attacks.</span></p>
<h3><span style="font-weight: 400;">Enforceable guarantees</span></h3>
<p><span style="font-weight: 400;">Private Processing utilizes CPU-based confidential virtualization technologies which allow attestation of software based in a hardware root of trust to guarantee the security of the system prior to each client-server connection. Before any data is transmitted, Private Processing checks these attestations, and confirms them against a third-party log of acceptable binaries.</span></p>
<h3><span style="font-weight: 400;">Stateless and forward secure service</span></h3>
<p><span style="font-weight: 400;">We operate Private Processing as a stateless service, which neither stores nor retains access to messages after the session has been completed.</span></p>
<p><span style="font-weight: 400;">Additionally, Private Processing does not store messages to disk or external storage, and thus does not maintain durable access to this data.</span></p>
<p><span style="font-weight: 400;">As part of our data minimization efforts, requests to Private Processing </span><span style="font-weight: 400;">only include data that is useful for processing the prompt — for example, message summarization will only include the messages the user directed AI to summarize.</span></p>
<h3><span style="font-weight: 400;">Non-targetability</span></h3>
<p><span style="font-weight: 400;">Private Processing implements</span><span style="font-weight: 400;"> the OHTTP protocol to establish a secure session with Meta routing layers. This ensures that Meta and WhatsApp do not know which user is connecting to what CVM. In other words, </span><span style="font-weight: 400;">Meta and WhatsApp do not know the user that initiated a request to Private Processing while the request is in route, so that a specific user cannot be routed to any specific hardware.</span></p>
<p><span style="font-weight: 400;">Private Processing uses anonymous credentials to authenticate users over OHTTP. This way, Private Processing can authenticate users to the Private Processing system, but remains unable to identify them. Private Processing does not include any other identifiable information as part of the request during the establishment of a system session. </span><span style="font-weight: 400;">We limit the impact of small-scale attacks by ensuring that they cannot be used to target the data of a specific user.</span></p>
<h3><span style="font-weight: 400;">Verifiable transparency</span></h3>
<p><span style="font-weight: 400;">To provide users visibility into the processing of their data and aid in validation of any client-side behaviors, we will provide capabilities to obtain an in-app log of requests made to Private Processing, data shared with it, and details of how that secure session was set up. </span></p>
<p><span style="font-weight: 400;">In order to provide verifiability, we will make available the CVM image binary powering Private Processing. We will make these components available to researchers to allow independent, external verification of our implementation.</span></p>
<p><span style="font-weight: 400;">In addition, to enable deeper bug bounty research in this area, we will publish source code for certain components of the system, including our attestation verification code or load bearing code.</span></p>
<p><span style="font-weight: 400;">We will also be expanding the scope of our existing </span><a href="https://bugbounty.meta.com/"><span style="font-weight: 400;">Bug Bounty program</span></a><span style="font-weight: 400;"> to cover Private Processing to enable further independent security research into Private Processing’s design and implementation. </span></p>
<p><span style="font-weight: 400;">Finally, we will be publishing a detailed technical white paper on the security engineering design of Private Processing to provide further transparency into our security practices, and aid others in the industry in building similar systems.</span></p>
<h2><span style="font-weight: 400;">Get Involved</span></h2>
<p><span style="font-weight: 400;">We’re deeply committed to providing our users with the best possible messaging experience while ensuring that only they and the people they’re talking to can access or share their personal messages. Private Processing is a critical component of this commitment, and we’re excited to make it available in the coming weeks.</span></p>
<p><span style="font-weight: 400;">We welcome feedback from our users, researchers, and the broader security community through our security research program:</span></p>
<ul>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">More details: </span><a href="https://bugbounty.meta.com"><span style="font-weight: 400;">Meta Bug Bounty</span></a></li>
<li style="font-weight: 400;" aria-level="1"><a href="mailto:bugbounty@meta.com"><span style="font-weight: 400;">Contact us</span></a></li>
</ul>
<p>The post <a rel="nofollow" href="https://engineering.fb.com/2025/04/29/security/whatsapp-private-processing-ai-tools/">Building Private Processing for AI tools on WhatsApp</a> appeared first on <a rel="nofollow" href="https://engineering.fb.com">Engineering at Meta</a>.</p>
How It’s Made: Little Language Lessons uses Gemini’s multilingual capabilities to personalize language learning - Google Developers Bloghttps://developers.googleblog.com/en/how-its-made-little-language-lessons-to-personalize-learning/2025-04-29T16:04:02.000ZLittle Language Lessons, a project leveraging Gemini's API and Cloud services to generate content, translate, and provide text-to-speech functionalities, includes vocabulary lessons, slang practice, and object recognition for language learning.Cutting through the noise: How to prioritize Dependabot alerts - The GitHub Bloghttps://github.blog/?p=864542025-04-29T16:00:39.000Z
<p>Let’s be honest: that flood of security alerts in your inbox can feel completely overwhelming. We’ve been there too.</p>
<p>As a developer advocate and a product manager focused on security at GitHub, we’ve seen firsthand how overwhelming it can be to triage vulnerability alerts. Dependabot is fantastic at spotting vulnerabilities, but without a smart way to prioritize them, you might be burning time on minor issues or (worse) missing the critical ones buried in the pile.</p>
<p>So, we’ve combined our perspectives—one from the security trenches and one from the developer workflow side—to share how we use <a href="https://www.first.org/epss/">Exploit Prediction Scoring System (EPSS)</a> scores and repository properties to transform the chaos into clarity and make informed prioritization decisions.</p>
<h2 id="understanding-software-supply-chain-security" id="understanding-software-supply-chain-security" >Understanding software supply chain security<a href="#understanding-software-supply-chain-security" class="heading-link pl-2 text-italic text-bold" aria-label="Understanding software supply chain security"></a></h2>
<p>If you’re building software today, you’re not just writing code—you’re assembling it from countless open source packages. In fact, <a href="https://www.linuxfoundation.org/research/census-iii?hsLang=en">96% of modern applications</a> are powered by open source software. With such widespread adoption, open source software has become a prime target for malicious actors looking to exploit vulnerabilities at scale.</p>
<p>Attackers continuously probe these projects for weaknesses, contributing to the thousands of Common Vulnerabilities and Exposures (CVEs) <a href="https://github.blog/security/supply-chain-security/securing-the-open-source-supply-chain-the-essential-role-of-cves/">reported</a> each year. But not all vulnerabilities carry the same level of risk. The key question becomes not just how to address vulnerabilities, but how to intelligently prioritize them based on your specific application architecture, deployment context, and business needs.</p>
<h2 id="understanding-epss-probability-of-exploitation-with-severity-if-it-happens" id="understanding-epss-probability-of-exploitation-with-severity-if-it-happens" >Understanding EPSS: probability of exploitation with severity if it happens<a href="#understanding-epss-probability-of-exploitation-with-severity-if-it-happens" class="heading-link pl-2 text-italic text-bold" aria-label="Understanding EPSS: probability of exploitation with severity if it happens"></a></h2>
<p>When it comes to prioritization, many teams still rely solely on severity scores like the Common Vulnerability Scoring System (CVSS). But not all “critical” vulnerabilities are equally likely to be exploited. That’s where EPSS comes in—it tells you the probability that a vulnerability will actually be exploited in the wild within the next 30 days.</p>
<p>Think of it this way: CVSS tells you how bad the damage could be if someone broke into your house, while EPSS tells you how likely it is that someone is actually going to try. Both pieces of information are crucial! This approach allows you to focus resources effectively.</p>
<p>As security pro Daniel Miessler points out in <a href="https://danielmiessler.com/blog/efficient-security-principle">Efficient Security Principle</a>, “The security baseline of an offering or system faces continuous downward pressure from customer excitement about, or reliance on, the offering in question.”</p>
<p>Translation? We’re always balancing security with usability, and we need to be smart about where we focus our limited time and energy. EPSS helps us spot the vulnerabilities with a higher likelihood of exploitation, allowing us to fix the most pressing risks first.</p>
<h2 id="smart-prioritization-steps" id="smart-prioritization-steps" >Smart prioritization steps<a href="#smart-prioritization-steps" class="heading-link pl-2 text-italic text-bold" aria-label="Smart prioritization steps"></a></h2>
<h3 id="1-combine-epss-with-cvss" id="1-combine-epss-with-cvss" >1. Combine EPSS with CVSS<a href="#1-combine-epss-with-cvss" class="heading-link pl-2 text-italic text-bold" aria-label="1. Combine EPSS with CVSS"></a></h3>
<p>One approach is to look at both likelihood (EPSS) and potential impact (CVSS) together. It’s like comparing weather forecasts—you care about both the chance of rain <em>and</em> how severe the storm might be.</p>
<p>For example, when prioritizing what to fix first, a vulnerability with:</p>
<ul>
<li>EPSS: 85% (high likelihood of exploitation)</li>
<li>CVSS: 9.8 (critical severity)</li>
</ul>
<p>…should almost always take priority over one with:</p>
<ul>
<li>EPSS: 0.5% (much less likely to be exploited) </li>
<li>CVSS: 9.0 (critical severity)</li>
</ul>
<p>Despite both having red-alert CVSS ratings, the first vulnerability is the one keeping us up at night.</p>
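<p>One illustrative way to encode that comparison in your own triage tooling (not an official formula) is to sort alerts by exploit likelihood first and severity second:</p>
<pre><code class="language-python">from dataclasses import dataclass

@dataclass
class Alert:
    name: str
    epss: float   # probability of exploitation in the next 30 days (0.0 to 1.0)
    cvss: float   # severity score (0.0 to 10.0)

# Hypothetical alerts mirroring the example above.
alerts = [
    Alert("lib-a transitive RCE", epss=0.85, cvss=9.8),
    Alert("lib-b parsing flaw", epss=0.005, cvss=9.0),
]

# Illustrative ordering only: likelihood of exploitation first, severity second.
for alert in sorted(alerts, key=lambda a: (a.epss, a.cvss), reverse=True):
    print(f"{alert.name}: EPSS={alert.epss:.1%}, CVSS={alert.cvss}")
</code></pre>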
<h3 id="2-leverage-repository-properties-for-context-aware-prioritization" id="2-leverage-repository-properties-for-context-aware-prioritization" >2. Leverage repository properties for context-aware prioritization<a href="#2-leverage-repository-properties-for-context-aware-prioritization" class="heading-link pl-2 text-italic text-bold" aria-label="2. Leverage repository properties for context-aware prioritization"></a></h3>
<p>Not all code is created equal when it comes to security risk. Ask yourself:</p>
<ul>
<li>Is this repo public or private? (Public repositories expose vulnerabilities to potential attackers) </li>
<li>Does it handle sensitive data like customer info or payments? </li>
<li>How often do you deploy? (Frequent deployments face tighter remediation times)</li>
</ul>
<p>One way to provide context-aware prioritization systematically is with <a href="https://docs.github.com/en/organizations/managing-organization-settings/managing-custom-properties-for-repositories-in-your-organization#about-custom-properties">custom repository properties</a>, which allow you to add contextual information about your repositories with information such as compliance frameworks, data sensitivity, or project details. By applying these custom properties to your repositories, you create a structured classification system that helps you identify the “repos that matter,” so you can prioritize Dependabot alerts for your production code rather than getting distracted by your totally-not-a-priority <code>test-vulnerabilities-local</code> repo.</p>
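<p>For example, a triage script could weight an alert by its repository’s custom properties. The property names and values below are made-up examples; custom properties are free-form key/value pairs you define for your own organization:</p>
<pre><code class="language-python"># Hypothetical custom properties fetched for each repository (the names
# "data-sensitivity" and "environment" are examples, not GitHub defaults).
REPO_PROPERTIES = {
    "payments-service": {"data-sensitivity": "high", "environment": "production"},
    "test-vulnerabilities-local": {"data-sensitivity": "none", "environment": "sandbox"},
}

def repo_weight(repo: str) -> float:
    props = REPO_PROPERTIES.get(repo, {})
    weight = 1.0
    if props.get("environment") == "production":
        weight *= 2.0   # production code gets priority
    if props.get("data-sensitivity") == "high":
        weight *= 2.0   # sensitive data raises the stakes
    return weight

# The same alert matters far more in the payments service.
print(repo_weight("payments-service"))            # 4.0
print(repo_weight("test-vulnerabilities-local"))  # 1.0
</code></pre>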
<h3 id="3-establish-clear-response-service-level-agreements-slas-based-on-risk-levels" id="3-establish-clear-response-service-level-agreements-slas-based-on-risk-levels" >3. Establish clear response Service Level Agreements (SLAs) based on risk levels<a href="#3-establish-clear-response-service-level-agreements-slas-based-on-risk-levels" class="heading-link pl-2 text-italic text-bold" aria-label="3. Establish clear response Service Level Agreements (SLAs) based on risk levels"></a></h3>
<p>Once you’ve done your homework on both the vulnerability characteristics and your repository context, you can establish clear response timelines that make sense for your organization’s resources and risk tolerance.</p>
<p>Let’s see how this works in real life: Here’s an example risk matrix that combines both EPSS (likelihood of exploitation) and CVSS (severity of impact).</p>
<div class="content-table-wrap"><table>
<thead>
<tr>
<th align="left">EPSS ↓ / CVSS →</th>
<th align="left">Low</th>
<th align="left">Medium</th>
<th align="left">High</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left"><strong>Low</strong></td>
<td align="left">✅ When convenient</td>
<td align="left">⏳ Next sprint</td>
<td align="left">⚠️ Fix Soon</td>
</tr>
<tr>
<td align="left"><strong>Medium</strong></td>
<td align="left">⏳ Next sprint</td>
<td align="left">⚠️ Fix soon</td>
<td align="left">🔥 Fix soon</td>
</tr>
<tr>
<td align="left"><strong>High</strong></td>
<td align="left">⚠️ Fix Soon</td>
<td align="left">🔥 Fix soon</td>
<td align="left">🚨 Fix first</td>
</tr>
</tbody>
</table></div>
<p>Say you get an alert about a vulnerability in your payment processing library that has both a high EPSS score and high CVSS rating. Red alert! Looking at our matrix, that’s a “Fix first” situation. You’ll probably drop what you’re doing, and put in some quick mitigations while the team works on a proper fix.</p>
<p>But what about that low-risk vulnerability in some testing utility that nobody even uses in production? Low EPSS, low CVSS… that can probably wait until “when convenient” within the next few weeks. No need to sound the alarm or pull developers off important feature work.</p>
<p>This kind of prioritization just makes sense. Applying the same urgency to every single vulnerability just leads to alert fatigue and wasted resources, and having clear guidelines helps your team know where to focus first.</p>
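<p>If you want to encode a matrix like this in your own tooling, a hedged sketch might look like the following. The band thresholds and labels are arbitrary illustrations, not recommendations:</p>
<pre><code class="language-python"># Example risk-matrix lookup; band thresholds are arbitrary illustrations.
def band(score, low, high):
    if score < low:
        return "low"
    if score < high:
        return "medium"
    return "high"

# Keys are (EPSS band, CVSS band), mirroring the matrix above.
SLA = {
    ("low", "low"): "when convenient",
    ("low", "medium"): "next sprint",
    ("low", "high"): "fix soon",
    ("medium", "low"): "next sprint",
    ("medium", "medium"): "fix soon",
    ("medium", "high"): "fix soon",
    ("high", "low"): "fix soon",
    ("high", "medium"): "fix soon",
    ("high", "high"): "fix first",
}

def response_sla(epss, cvss):
    return SLA[(band(epss, 0.05, 0.50), band(cvss, 4.0, 7.0))]

print(response_sla(epss=0.85, cvss=9.8))    # fix first
print(response_sla(epss=0.005, cvss=2.1))   # when convenient
</code></pre>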
<h2 id="integration-with-enterprise-governance" id="integration-with-enterprise-governance" >Integration with enterprise governance<a href="#integration-with-enterprise-governance" class="heading-link pl-2 text-italic text-bold" aria-label="Integration with enterprise governance"></a></h2>
<p>For enterprise organizations, GitHub’s <a href="https://docs.github.com/en/code-security/dependabot/dependabot-auto-triage-rules/about-dependabot-auto-triage-rules">auto-triage rules</a> help provide consistent management of security alerts at scale across multiple teams and repositories.</p>
<p>Auto-triage rules allow you to create custom criteria for automatically handling alerts based on factors like severity, EPSS, scope, package name, CVE, ecosystem, and manifest location. You can create your own custom rules to control how Dependabot auto-dismisses and reopens alerts, so you can focus on the alerts that matter.</p>
<p>These rules are particularly powerful because they:</p>
<ul>
<li>Apply to both existing and future alerts. </li>
<li>Allow for proactive filtering of false positives. </li>
<li>Enable “snooze until patch” functionality for vulnerabilities without a fix available. </li>
<li>Provide visibility into automated decisions through the auto-dismiss alert resolution.</li>
</ul>
<p>GitHub-curated presets like auto-dismissal of false positives are free for everyone and all repositories, while custom auto-triage rules are available for free on public repositories and as part of GitHub Advanced Security for private repositories.</p>
<h2 id="the-real-world-impact-of-smart-prioritization" id="the-real-world-impact-of-smart-prioritization" >The real-world impact of smart prioritization<a href="#the-real-world-impact-of-smart-prioritization" class="heading-link pl-2 text-italic text-bold" aria-label="The real-world impact of smart prioritization"></a></h2>
<p>When teams get prioritization right, organizations can see significant improvements in security management. Research firmly supports this approach: the comprehensive <a href="https://www.cyentia.com/epss-study/">Cyentia EPSS study</a> found that teams could achieve 87% coverage of exploited vulnerabilities by focusing on just 10% of them, reducing remediation effort by 83% compared to traditional CVSS-based approaches. This isn’t just theoretical; it translates to real-world efficiency gains.</p>
<p>This reduction is not just about numbers. When security teams provide clear reasoning behind prioritization decisions, developers gain a better understanding of security requirements. This transparency builds trust between teams, potentially leading to more efficient resolution processes and improved collaboration between security and development teams.</p>
<p>The most successful security teams pair smart automation with human judgment and transparent communication. This shift from alert overload to smart filtering lets teams focus on what truly matters, turning security from a constant headache into a manageable, strategic advantage.</p>
<h2 id="getting-started" id="getting-started" >Getting started<a href="#getting-started" class="heading-link pl-2 text-italic text-bold" aria-label="Getting started"></a></h2>
<p>Ready to tame that flood of alerts? Here’s how to begin:</p>
<ul>
<li><strong>Enable Dependabot security updates</strong>: If you haven’t already, <a href="https://www.youtube.com/watch?v=yvXKlDgiGHo">turn on Dependabot alerts</a> and automatic security updates in your repository settings. This is your first line of defense!
</li>
<li>
<p><strong>Set up auto-triage rules</strong>: <a href="https://docs.github.com/en/code-security/dependabot/dependabot-auto-triage-rules/about-dependabot-auto-triage-rules">Create custom rules</a> based on severity, scope, package name, and other criteria to automatically handle low-priority alerts. Auto-triage rules are a powerful tool to help you reduce false positives and alert fatigue substantially, while better managing your alerts at scale.</p>
</li>
<li>
<p><strong>Establish clear prioritization criteria</strong>: Define what makes a vulnerability critical for your specific projects. Develop a clear matrix for identifying critical issues, considering factors like impact assessment, system criticality, and exploit likelihood.</p>
</li>
<li>
<p><strong>Consult your remediation workflow for priority alerts:</strong> Verify the vulnerability’s authenticity and develop a quick mitigation strategy based on your organization’s risk response matrix.</p>
</li>
</ul>
<p>By implementing these smart prioritization strategies, you’ll help focus your team’s energy where it matters most: keeping your code secure and your customers protected. No more security alert overload, just focused, effective prioritization.</p>
<div class="post-content-cta"><p><strong>Want to streamline security alert management for your organization?</strong> Start using Dependabot for free or unlock advanced prioritization with <a href="https://github.com/security/advanced-security">GitHub Code Security</a> today.</p>
</div>
<p>The post <a href="https://github.blog/security/application-security/cutting-through-the-noise-how-to-prioritize-dependabot-alerts/">Cutting through the noise: How to prioritize Dependabot alerts</a> appeared first on <a href="https://github.blog">The GitHub Blog</a>.</p>
Usability and safety updates to Google Auth Platform - Google Developers Bloghttps://developers.googleblog.com/en/usability-and-safety-updates-to-google-auth-platform/2025-04-28T19:29:01.000ZUpdates to the Google Auth Platform include changes to OAuth configuration, client secrets display, and automatic deletion of unused clients, making the platform more secure and easier to use.