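AI Automation in Azure Foundry with One-Click MCP Integration and the Computer Use Agent Model

June 30, 2025

| Azure AI

While experimenting with Azure AI Foundry recently, I found that it can now integrate Model Context Protocol (MCP) servers seamlessly through OpenAI's Responses API. Previously you had to write your own MCP client, which was tedious. Now a bit of configuration is enough, which makes building agentic AI solutions much simpler.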
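To make "just configure it" concrete, here is a minimal sketch of what such a request might look like. It only builds the request payload and makes no network call. The tool fields (`type`, `server_label`, `server_url`, `require_approval`) follow the remote MCP tool format of the Responses API as I understand it; the helper name `build_mcp_request`, the deployment name, and the server URL are hypothetical, and whether your Azure API version accepts this tool type is an assumption to verify.

```python
import json


def build_mcp_request(deployment: str, query: str,
                      server_label: str, server_url: str) -> dict:
    """Build keyword arguments for a Responses API call that exposes
    one remote MCP server to the model as a tool (hypothetical helper)."""
    return {
        "model": deployment,  # name of your model deployment (assumed)
        "input": query,
        "tools": [
            {
                "type": "mcp",
                "server_label": server_label,
                "server_url": server_url,
                # Skips per-call approval; only sensible for servers you trust.
                "require_approval": "never",
            }
        ],
    }


request = build_mcp_request(
    deployment="my-gpt-deployment",       # hypothetical deployment name
    query="latest trends in sustainable fashion",
    server_label="trend_search",          # hypothetical label
    server_url="https://example.com/mcp", # placeholder URL
)
print(json.dumps(request, indent=2, ensure_ascii=False))
```

With a client from the `openai` package configured for your resource, the payload would be passed as `client.responses.create(**request)`.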

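Scenario: Fashion Trend Discovery

For example, suppose you are a fashion analyst. You only need to type a query such as "latest trends in sustainable fashion" on the command line, and the system handles the rest of the workflow automatically: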

……


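Understanding the Microsoft Azure OpenAI API Version Lifecycle

June 17, 2025

| Azure AI

As AI technology advances rapidly, Microsoft's Azure OpenAI service has become a go-to platform for many enterprises and developers building intelligent applications. To use the service efficiently and sustainably, it is essential to understand the API version lifecycle. This article walks through the version lifecycle policy of the Azure OpenAI API to help you handle the challenges that service updates and iteration bring.

1. What Is the API Version Lifecycle?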

……


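Strengthening AI Security: Azure Prompt Shields and Azure AI Content Safety

June 11, 2025

| Azure AI

Generative AI has been adopted more and more widely in recent years, and the security problems that come with it have grown more prominent. Prompt injection attacks in particular have become one of the main threats to AI systems. To address this, Azure introduced Prompt Shields and Azure AI Content Safety, which help developers protect AI systems against both direct and indirect threats.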

……


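Mastering the Model Context Protocol (MCP): Building a Multi-Server MCP Setup with Azure OpenAI

June 7, 2025

| Azure AI

Introduction

The Model Context Protocol (MCP) is quickly becoming an important standard framework for building highly intelligent, interoperable AI applications. Most MCP material focuses on single-server deployments. In this article, we show how to combine Azure OpenAI with a multi-server MCP architecture to build a scalable, customizable agent platform, so that you can connect and orchestrate multiple tool servers from a single front end and have agent tools collaborate across domains.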

……


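Azure AI Search Adds Multi-Vector Field Support and Semantic Ranking Enhancements

May 29, 2025

| Azure AI

Azure AI Search recently released two powerful new features: multi-vector field support and scoring profile integration for semantic ranking. Both were developed in response to user feedback. They give you more control over the search experience and open up more scenarios.

Why These Enhancements Matter

As search experiences grow more sophisticated, handling complex multimodal data while keeping relevance precise becomes essential. These new features directly address common pain points: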

……


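A Technical Deep Dive into Orchestrating AI Agents with Semantic Kernel Plugins

May 8, 2025

| Azure AI

In the fast-moving field of large language models (LLMs), orchestrating specialized AI agents has become key to building sophisticated cognitive systems capable of complex reasoning and task execution. Powerful as this is, coordinating multiple agents with distinct capabilities and data access permissions poses significant engineering challenges. Microsoft's Semantic Kernel (SK) provides a strong framework for managing that complexity through its intuitive plugin system. This article looks closely at how to use SK plugins for efficient agent orchestration, illustrated with practical implementation patterns.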

……


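Real-Time Speech Transcription with GPT-4o-transcribe and GPT-4o-mini-transcribe over WebSocket

May 3, 2025

| Azure AI

Azure OpenAI recently strengthened its speech recognition offering with two impressive models: GPT-4o-transcribe and GPT-4o-mini-transcribe. A key feature of both is real-time transcription of streamed audio over a WebSocket connection, which gives developers an advanced tool for building speech-to-text applications. This article explores how these models work and provides a practical example implemented in Python.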

……


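Latency in Speech Recognition and Synthesis, and Strategies for Reducing It

August 9, 2024

| Azure AI

Latency in speech recognition and synthesis can be a major obstacle to building seamless, efficient applications. Reducing it improves the user experience and also raises the overall performance of real-time applications. This article covers strategies for reducing latency in general transcription, real-time transcription, file transcription, and speech synthesis.

1. Network Latency: Move the Speech Resource Closer to the Application

Network latency is one of the main contributors to speech recognition latency. To mitigate it, the key is to reduce the distance between the application and the speech recognition resource. Here are some suggestions: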

……


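Azure OpenAI Voice Chat

May 15, 2024

| Azure AI

This article shows how to implement a fully voice-driven chat using Azure AI Speech together with the Azure OpenAI service, and how to improve it so that the conversation is non-blocking.

Key Points

Azure AI Speech recognizes the spoken input as text
The text is sent to Azure OpenAI, which returns a streaming reply
Azure AI Speech synthesizes the streamed response text into speech

Code Sample (Python)

The sample below is the Python version. For samples in other languages, see OpenAI-Speech.

Install Dependencies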


pip install azure-cognitiveservices-speech
pip install openai

Add the Code


import os
import azure.cognitiveservices.speech as speechsdk
from openai import AzureOpenAI

# This example requires environment variables named
# "OPEN_AI_KEY", "OPEN_AI_ENDPOINT" and "OPEN_AI_DEPLOYMENT_NAME"

# Your endpoint should look like the following:
# https://YOUR_OPEN_AI_RESOURCE_NAME.openai.azure.com/
client = AzureOpenAI(
    azure_endpoint=os.environ.get('OPEN_AI_ENDPOINT'),
    api_key=os.environ.get('OPEN_AI_KEY'),
    api_version="2024-05-01-preview"
)

# This will correspond to the custom name you chose for
# your deployment when you deployed a model.
deployment_id = os.environ.get('OPEN_AI_DEPLOYMENT_NAME')

# This example requires environment variables
# named "SPEECH_KEY" and "SPEECH_REGION"
speech_config = speechsdk.SpeechConfig(subscription=os.environ.get('SPEECH_KEY'),
                                       region=os.environ.get('SPEECH_REGION'))
audio_output_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)
audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)

# Should be the locale for the speaker's language.
speech_config.speech_recognition_language = "zh-CN"
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config,
                                               audio_config=audio_config)

# The voice that responds on behalf of Azure OpenAI.
speech_config.speech_synthesis_voice_name = 'zh-CN-YunyiMultilingualNeural'
speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config,
                                                 audio_config=audio_output_config)

# Characters that mark the end of a sentence for TTS
tts_sentence_end = [".", "!", "?", ";", "。", "！", "？", "；", "\n"]


# Prompts Azure OpenAI with a request and synthesizes the response.
def ask_openai(prompt):
    # Ask Azure OpenAI for a streamed reply
    response = client.chat.completions.create(
        model=deployment_id,
        max_tokens=200,
        stream=True,
        messages=[{"role": "user", "content": prompt}])
    collected_messages = []
    last_tts_request = None

    # Iterate through the streamed response
    for chunk in response:
        if len(chunk.choices) > 0:
            # Extract the message text from this chunk
            chunk_message = chunk.choices[0].delta.content
            if chunk_message is not None:
                # Save the message
                collected_messages.append(chunk_message)
                # Sentence end found
                if chunk_message in tts_sentence_end:
                    # Join the received messages together to build a sentence
                    text = ''.join(collected_messages).strip()
                    # Skip sentences that contain only "\n" or spaces
                    if text != '':
                        print(f"Speech synthesized to speaker for: {text}")
                        last_tts_request = speech_synthesizer.speak_text_async(text)
                        collected_messages.clear()

    # Speak any text left over when the stream ends without a sentence mark
    # (for example, when the reply is cut off by max_tokens)
    text = ''.join(collected_messages).strip()
    if text != '':
        print(f"Speech synthesized to speaker for: {text}")
        last_tts_request = speech_synthesizer.speak_text_async(text)

    # Wait for the last synthesis request to finish before returning
    if last_tts_request:
        last_tts_request.get()


# Continuously listens for speech input to recognize
# and send as text to Azure OpenAI
def chat_with_open_ai():
    while True:
        print("Azure OpenAI is listening. "
              "Say 'Stop' or press Ctrl-Z to end the conversation.")
        try:
            # Get audio from the microphone and
            # send it to the speech recognition service.
            speech_recognition_result = speech_recognizer.recognize_once_async().get()

            # If speech is recognized, send it to Azure OpenAI
            # and listen for the response.
            if speech_recognition_result.reason == speechsdk.ResultReason.RecognizedSpeech:
                if speech_recognition_result.text in ("Stop.", "Stop。"):
                    print("Conversation ended.")
                    break
                print("Recognized speech: {}".format(speech_recognition_result.text))
                ask_openai(speech_recognition_result.text)
            elif speech_recognition_result.reason == speechsdk.ResultReason.NoMatch:
                print("No speech could be recognized: {}".format(
                    speech_recognition_result.no_match_details))
                break
            elif speech_recognition_result.reason == speechsdk.ResultReason.Canceled:
                cancellation_details = speech_recognition_result.cancellation_details
                print("Speech Recognition canceled: {}".format(
                    cancellation_details.reason))
                if cancellation_details.reason == speechsdk.CancellationReason.Error:
                    print("Error details: {}".format(
                        cancellation_details.error_details))
        except EOFError:
            break


# Main
try:
    chat_with_open_ai()
except Exception as err:
    print("Encountered exception. {}".format(err))
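One detail worth noting: the sample hands text to the synthesizer only when a streamed chunk is exactly one of the `tts_sentence_end` marks. If the service ever delivers a mark inside a longer chunk (for example `"。然后"`), that sentence boundary is missed. The following is a minimal sketch of a character-level splitter that does not depend on how the stream is chunked. The names `split_sentences` and `SENTENCE_END` are hypothetical and not part of the sample; the input is simulated, so nothing here calls Azure.

```python
from typing import Iterable, Iterator

# Same sentence-end marks as tts_sentence_end in the sample
SENTENCE_END = (".", "!", "?", ";", "。", "！", "？", "；", "\n")


def split_sentences(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences from streamed text chunks,
    regardless of where the chunk boundaries fall."""
    buffer = ""
    for chunk in chunks:
        for ch in chunk:
            buffer += ch
            if ch in SENTENCE_END:
                sentence = buffer.strip()
                if sentence:  # skip whitespace-only sentences
                    yield sentence
                buffer = ""
    # Flush whatever is left when the stream ends mid-sentence
    tail = buffer.strip()
    if tail:
        yield tail


# Simulated stream: sentence marks land in the middle of chunks
demo_chunks = ["你好", "！今天", "天气不错。Shall we", " go out", "?", " Maybe"]
sentences = list(split_sentences(demo_chunks))
for s in sentences:
    print(s)
```

In the sample, each yielded sentence would be passed to `speech_synthesizer.speak_text_async`. Splitting on every "." also breaks decimals such as "3.14", so a production version would likely need a smarter rule.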

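Non-Blocking Improvement

The sample above runs as a strict question-and-answer exchange. If you want the conversation to be interruptible, change the speech recognition code to be non-blocking: instead of calling ask_openai directly, run it on a separate thread.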
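A minimal sketch of that idea, using only the standard library. The `ask_openai` here is a stand-in that blocks on an event to simulate a slow streamed reply; it is not the function from the sample, and `start_reply_thread`, `replies`, and `release` are hypothetical names used for illustration.

```python
import threading


def ask_openai(prompt: str, replies: list, release: threading.Event) -> None:
    """Stand-in for the sample's ask_openai: blocks until `release`
    is set, simulating a slow streamed reply and synthesis."""
    release.wait()
    replies.append(f"reply to: {prompt}")


def start_reply_thread(text: str, replies: list,
                       release: threading.Event) -> threading.Thread:
    """Run ask_openai on a worker thread so the caller can go
    straight back to listening for speech."""
    worker = threading.Thread(target=ask_openai,
                              args=(text, replies, release),
                              daemon=True)
    worker.start()
    return worker


replies = []
release = threading.Event()

worker = start_reply_thread("你好", replies, release)
# The call returned immediately while the reply is still pending.
still_running = worker.is_alive()
replies_before = list(replies)

release.set()   # let the simulated reply finish
worker.join()
print(still_running, replies_before, replies)
```

In the recognition loop, `ask_openai(speech_recognition_result.text)` would be replaced by a call like `start_reply_thread(...)`. Actually interrupting a reply also requires stopping in-flight synthesis, which this sketch does not cover.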

……


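Building a Custom Copilot with Semantic Kernel

March 17, 2024

| Azure AI

This article focuses on how to create your own Copilot using Semantic Kernel backed by the Azure OpenAI service. We try to combine the strengths of large language models (LLMs) with the integration of external services. It should give you a sense of how to actually reach your Copilot goals, not only in retail but in any industry, such as power and utilities or government and the public sector. Its overall capabilities and potential applications go well beyond those of a chatbot.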

……

