[Question]: UnprocessableEntityError of AzureOpenAI #18860

Open · Zihan-Zhu opened this issue May 27, 2025 · 3 comments

Labels
question Further information is requested

Comments

@Zihan-Zhu

Question Validation

  • I have searched both the documentation and discord for an answer.

Question

Dear team,

When I tried to build and run a simple FunctionAgent using a QueryEngineTool, I got an error like this:

---------------------------------------------------------------------------
UnprocessableEntityError                  Traceback (most recent call last)
File /ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/workflow/context.py:618, in Context._step_worker(self, name, step, config, stepwise, verbose, checkpoint_callback, run_id, service_manager, dispatcher)
    617 try:
--> 618     new_ev = await instrumented_step(**kwargs)
    619     kwargs.clear()

File /ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py:370, in Dispatcher.span.<locals>.async_wrapper(func, instance, args, kwargs)
    369 try:
--> 370     result = await func(*args, **kwargs)
    371 except BaseException as e:

File /ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/agent/workflow/multi_agent_workflow.py:394, in AgentWorkflow.run_agent_step(self, ctx, ev)
    392 tools = await self.get_tools(ev.current_agent_name, user_msg_str or "")
--> 394 agent_output = await agent.take_step(
    395     ctx,
    396     ev.input,
    397     tools,
    398     memory,
    399 )
    401 ctx.write_event_to_stream(agent_output)

File /ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/agent/workflow/function_agent.py:48, in FunctionAgent.take_step(self, ctx, llm_input, tools, memory)
     47 last_chat_response = ChatResponse(message=ChatMessage())
---> 48 async for last_chat_response in response:
     49     tool_calls = self.llm.get_tool_calls_from_response(  # type: ignore
     50         last_chat_response, error_on_no_tool_call=False
     51     )

File /ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/llms/callbacks.py:88, in llm_chat_callback.<locals>.wrap.<locals>.wrapped_async_llm_chat.<locals>.wrapped_gen()
     87 try:
---> 88     async for x in f_return_val:
     89         dispatcher.event(
     90             LLMChatInProgressEvent(
     91                 messages=messages,
   (...)
     94             )
     95         )

File /ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/llms/openai/base.py:783, in OpenAI._astream_chat.<locals>.gen()
    782 first_chat_chunk = True
--> 783 async for response in await aclient.chat.completions.create(
    784     messages=message_dicts,
    785     **self._get_model_kwargs(stream=True, **kwargs),
    786 ):
    787     response = cast(ChatCompletionChunk, response)

File /ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py:2028, in AsyncCompletions.create(self, messages, model, audio, frequency_penalty, function_call, functions, logit_bias, logprobs, max_completion_tokens, max_tokens, metadata, modalities, n, parallel_tool_calls, prediction, presence_penalty, reasoning_effort, response_format, seed, service_tier, stop, store, stream, stream_options, temperature, tool_choice, tools, top_logprobs, top_p, user, web_search_options, extra_headers, extra_query, extra_body, timeout)
   2027 validate_response_format(response_format)
-> 2028 return await self._post(
   2029     "/chat/completions",
   2030     body=await async_maybe_transform(
   2031         {
   2032             "messages": messages,
   2033             "model": model,
   2034             "audio": audio,
   2035             "frequency_penalty": frequency_penalty,
   2036             "function_call": function_call,
   2037             "functions": functions,
   2038             "logit_bias": logit_bias,
   2039             "logprobs": logprobs,
   2040             "max_completion_tokens": max_completion_tokens,
   2041             "max_tokens": max_tokens,
   2042             "metadata": metadata,
   2043             "modalities": modalities,
   2044             "n": n,
   2045             "parallel_tool_calls": parallel_tool_calls,
   2046             "prediction": prediction,
   2047             "presence_penalty": presence_penalty,
   2048             "reasoning_effort": reasoning_effort,
   2049             "response_format": response_format,
   2050             "seed": seed,
   2051             "service_tier": service_tier,
   2052             "stop": stop,
   2053             "store": store,
   2054             "stream": stream,
   2055             "stream_options": stream_options,
   2056             "temperature": temperature,
   2057             "tool_choice": tool_choice,
   2058             "tools": tools,
   2059             "top_logprobs": top_logprobs,
   2060             "top_p": top_p,
   2061             "user": user,
   2062             "web_search_options": web_search_options,
   2063         },
   2064         completion_create_params.CompletionCreateParamsStreaming
   2065         if stream
   2066         else completion_create_params.CompletionCreateParamsNonStreaming,
   2067     ),
   2068     options=make_request_options(
   2069         extra_headers=extra_headers, extra_query=extra_query, extra_body=extra_body, timeout=timeout
   2070     ),
   2071     cast_to=ChatCompletion,
   2072     stream=stream or False,
   2073     stream_cls=AsyncStream[ChatCompletionChunk],
   2074 )

File /ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/openai/_base_client.py:1742, in AsyncAPIClient.post(self, path, cast_to, body, files, options, stream, stream_cls)
   1739 opts = FinalRequestOptions.construct(
   1740     method="post", url=path, json_data=body, files=await async_to_httpx_files(files), **options
   1741 )
-> 1742 return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)

File /ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/openai/_base_client.py:1549, in AsyncAPIClient.request(self, cast_to, options, stream, stream_cls)
   1548     log.debug("Re-raising status error")
-> 1549     raise self._make_status_error_from_response(err.response) from None
   1551 break

UnprocessableEntityError: Error code: 422 - {'error': {'code': 'OperationParameterInvalid', 'message': 'Streaming responses are not currently supported'}}

The above exception was the direct cause of the following exception:

WorkflowRuntimeError                      Traceback (most recent call last)
Cell In[10], line 32
     29     elif isinstance(ev, AgentStream):
     30         print(ev.delta, end="", flush=True)
---> 32 response = await handler
     34 #response = await agent.run("what is the leg1 definition for IR single currency swap?")
     36 print(response)

File /ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/workflow/workflow.py:403, in Workflow.run.<locals>._run_workflow()
    399 if exception_raised:
    400     # cancel the stream
    401     ctx.write_event_to_stream(StopEvent())
--> 403     raise exception_raised
    405 if not we_done:
    406     # cancel the stream
    407     ctx.write_event_to_stream(StopEvent())

File /ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/workflow/context.py:627, in Context._step_worker(self, name, step, config, stepwise, verbose, checkpoint_callback, run_id, service_manager, dispatcher)
    625 except Exception as e:
    626     if config.retry_policy is None:
--> 627         raise WorkflowRuntimeError(
    628             f"Error in step '{name}': {e!s}"
    629         ) from e
    631     delay = config.retry_policy.next(
    632         retry_start_at + time.time(), attempts, e
    633     )
    634     if delay is None:
    635         # We're done retrying

WorkflowRuntimeError: Error in step 'run_agent_step': Error code: 422 - {'error': {'code': 'OperationParameterInvalid', 'message': 'Streaming responses are not currently supported'}}

ERROR:asyncio:Exception in callback Dispatcher.span.<locals>.wrapper.<locals>.handle_future_result(span_id='Workflow.run...-55e878da5779', bound_args=<BoundArgumen...StartEvent())>, instance=<llama_index....x7f2667fdba10>, context=<_contextvars...x7f266cc9b080>)(<WorkflowHand...upported'}}")>) at /ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py:276
handle: <Handle Dispatcher.span.<locals>.wrapper.<locals>.handle_future_result(span_id='Workflow.run...-55e878da5779', bound_args=<BoundArgumen...StartEvent())>, instance=<llama_index....x7f2667fdba10>, context=<_contextvars...x7f266cc9b080>)(<WorkflowHand...upported'}}")>) at /ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py:276>
Traceback (most recent call last):
  File "/ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/workflow/context.py", line 618, in _step_worker
    new_ev = await instrumented_step(**kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 370, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/agent/workflow/multi_agent_workflow.py", line 394, in run_agent_step
    agent_output = await agent.take_step(
                   ^^^^^^^^^^^^^^^^^^^^^^
  File "/ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/agent/workflow/function_agent.py", line 48, in take_step
    async for last_chat_response in response:
  File "/ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/llms/callbacks.py", line 88, in wrapped_gen
    async for x in f_return_val:
  File "/ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/llms/openai/base.py", line 783, in gen
    async for response in await aclient.chat.completions.create(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 2028, in create
    return await self._post(
           ^^^^^^^^^^^^^^^^^
  File "/ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/openai/_base_client.py", line 1742, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/openai/_base_client.py", line 1549, in request
    raise self._make_status_error_from_response(err.response) from None
openai.UnprocessableEntityError: Error code: 422 - {'error': {'code': 'OperationParameterInvalid', 'message': 'Streaming responses are not currently supported'}}

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "/ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 288, in handle_future_result
    raise exception
  File "/ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3506, in run_code
    await eval(code_obj, self.user_global_ns, self.user_ns)
  File "/tmp/ipykernel_7268/2093183576.py", line 32, in <module>
    response = await handler
               ^^^^^^^^^^^^^
  File "/ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/workflow/workflow.py", line 403, in _run_workflow
    raise exception_raised
  File "/ebs/dataiku/dssdata/code-envs/python/zzh_py311/lib/python3.11/site-packages/llama_index/core/workflow/context.py", line 627, in _step_worker
    raise WorkflowRuntimeError(
llama_index.core.workflow.errors.WorkflowRuntimeError: Error in step 'run_agent_step': Error code: 422 - {'error': {'code': 'OperationParameterInvalid', 'message': 'Streaming responses are not currently supported'}}

My code looks like:

from llama_index.core import Settings
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding

llm = AzureOpenAI(
    model="gpt-4o",
    deployment_name="gpt-4o",
    # api_key=api_key,
    azure_ad_token_provider=token_provider,
    use_azure_ad=True,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)

embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name="text-embedding-ada-002",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)

Settings.llm = llm
Settings.embed_model = embed_model

from llama_index.core.tools import QueryEngineTool
from llama_index.core.agent.workflow import FunctionAgent, ToolCallResult, AgentStream
from llama_index.core.workflow import Context

query_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine, 
    name="xxx", 
    description="xxx",
)

# Create an agent workflow with our tool
agent = FunctionAgent(
    tools=[query_tool],
    system_prompt="xxx",
)

ctx = Context(agent)

handler = agent.run("question?", ctx=ctx)

async for ev in handler.stream_events():
    if isinstance(ev, ToolCallResult):
        print(
            f"Call {ev.tool_name} with args {ev.tool_kwargs}\nReturned: {ev.tool_output}"
        )
    elif isinstance(ev, AgentStream):
        print(ev.delta, end="", flush=True)

response = await handler

print(response)

Does anyone know how to solve this issue? Thanks!

Best,
Zihan

Zihan-Zhu added the question label on May 27, 2025

dosubot bot commented May 27, 2025

Hi @Zihan-Zhu! I'm Dosu and I’m helping the llama_index team.

The error you're seeing (UnprocessableEntityError: Error code: 422 - {'error': {'code': 'OperationParameterInvalid', 'message': 'Streaming responses are not currently supported'}}) occurs because AzureOpenAI does not currently support streaming responses for chat completions. This is a limitation of the Azure OpenAI service itself, not LlamaIndex.

According to the LlamaIndex streaming documentation, streaming is supported by certain LLMs such as OpenAI, HuggingFaceLLM, and most LangChain LLMs, but AzureOpenAI is not listed among them. When you attempt to use streaming with an LLM that doesn't support it, you will encounter an error like the one you described.

How to resolve this:

  • You will need to disable streaming when using AzureOpenAI. In your code, avoid streaming-related methods or parameters (such as stream=True) and instead use the standard completion methods that return the full response at once (see the sketch after this list).
  • If you require streaming, you would need to use a provider that supports it, such as OpenAI's own API.
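
A minimal sketch of the non-streaming approach, assuming the llm and query_engine objects defined in the question above. Note that FunctionAgent streams internally, so these calls bypass the agent rather than reconfigure it:

from llama_index.core.llms import ChatMessage

# Non-streaming chat call: returns the complete response at once.
chat_response = llm.chat([ChatMessage(role="user", content="question?")])
print(chat_response.message.content)

# Or query the engine directly instead of going through the agent.
result = query_engine.query("question?")
print(result)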

Summary:
AzureOpenAI does not support streaming responses for chat completions, which is why you are seeing this error. To resolve it, use non-streaming methods with AzureOpenAI, or switch to a provider that supports streaming if that functionality is required for your application.

logan-markewich (Collaborator) commented May 27, 2025

I'm not sure why, but it says streaming is not supported. Is there something about your Azure deployment that would disable streaming? First time I've ever seen that error from Azure.

Right now, agents require an LLM that supports streaming.
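
One way to verify is to call the deployment directly with the openai SDK, bypassing LlamaIndex. The sketch below reuses the token_provider, azure_endpoint, and api_version variables from the question; if it raises the same 422 OperationParameterInvalid error, streaming is blocked at the deployment or gateway level rather than by LlamaIndex:

from openai import AzureOpenAI as AzureOpenAIClient

client = AzureOpenAIClient(
    azure_ad_token_provider=token_provider,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)

# stream=True is what the agent uses under the hood.
stream = client.chat.completions.create(
    model="gpt-4o",  # the Azure deployment name from the question
    messages=[{"role": "user", "content": "ping"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")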

Zihan-Zhu (Author) commented

Thank you for your reply. Since I use the Azure OpenAI API through my organization, I guess they probably disabled the 'streaming' option?
