feat: Exporting traces to Traceloop and Instana #8209
CodSpeed Performance Report: merging #8209 will degrade performance by 15.75%.
CI is currently skipped because should-run-ci is evaluated as false. Could you please help me look into this? @ogabrielluiz
    start_time=self._get_current_timestamp(),
)
self.root_span.set_attribute(SpanAttributes.SESSION_ID, self.session_id or self.flow_id)
self.root_span.set_attribute(SpanAttributes.OPENINFERENCE_SPAN_KIND, self.trace_type)
should be the normal OpenTelemetry span kind
I've replaced the OpenInference span kinds with OpenTelemetry span kinds. Please review it.
Walkthrough
A new tracing integration called "traceloop" was added to the tracing service, including a dedicated tracer implementation using OpenTelemetry. The service and its tests were updated to support this tracer. Four new OpenTelemetry-related dependencies were added to the project configuration to support the new functionality.
Sequence Diagram(s)
sequenceDiagram
participant User
participant TracingService
participant TraceloopTracer
participant OpenTelemetry
User->>TracingService: start_tracers()
TracingService->>TraceloopTracer: _initialize_traceloop_tracer()
TraceloopTracer->>OpenTelemetry: setup_traceloop()
TracingService->>TracingService: Register traceloop tracer in context
User->>TracingService: Trace events (add_trace, end_trace, end)
TracingService->>TraceloopTracer: add_trace / end_trace / end
TraceloopTracer->>OpenTelemetry: Create/finish spans, record attributes
Actionable comments posted: 3
♻️ Duplicate comments (1)
src/backend/base/langflow/services/tracing/traceloop.py (1)
77-88: 🛠️ Refactor suggestion
Root span should be created with kind=otel_span_kind instead of a custom attribute
OpenTelemetry uses the kind argument when starting a span; setting a custom span.kind attribute is only a convention and makes the UI depend on non-standard fields.
-self.root_span = self.tracer.start_span(
-    name=self.flow_id,
-    start_time=self._get_current_timestamp(),
-)
+self.root_span = self.tracer.start_span(
+    name=self.flow_id,
+    kind=otel_span_kind,
+    start_time=self._get_current_timestamp(),
+)
🧹 Nitpick comments (3)
src/backend/base/langflow/services/tracing/service.py (1)
56-60: Nit: keep helper import functions alphabetically grouped
The helper loaders are currently ordered alphabetically except for the new _get_traceloop_tracer, which was appended at the end of the block.
Re-ordering keeps the file easier to skim:
 def _get_opik_tracer():
     ...
     return OpikTracer

+# --- New ---
+def _get_traceloop_tracer():
+    from langflow.services.tracing.traceloop import TraceloopTracer
+    return TraceloopTracer
+
src/backend/base/langflow/services/tracing/traceloop.py (2)
181-187: Pass kind directly when creating the child span
Similar to the root span, rely on the API parameter instead of a custom attribute:
-otel_span_kind = trace_type_mapping.get(trace_type, SpanKind.INTERNAL)
-child_span.set_attribute("span.kind", otel_span_kind.name.lower())
+otel_span_kind = trace_type_mapping.get(trace_type, SpanKind.INTERNAL)
+# let OpenTelemetry know the semantic kind
+child_span.update_name(trace_name)  # keep the name
+child_span.kind = otel_span_kind
(or simply include kind=otel_span_kind in start_span).
265-296: types.NoneType is only available from Python 3.10+
The project already requires 3.10, so it's fine, but the check for NoneType redundantly converts None to the string "None"; OpenTelemetry can encode null values natively.
Optional: remove this branch and let json.dumps serialise null.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- uv.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
- pyproject.toml (1 hunks)
- src/backend/base/langflow/services/tracing/service.py (3 hunks)
- src/backend/base/langflow/services/tracing/traceloop.py (1 hunks)
- src/backend/tests/unit/services/tracing/test_tracing_service.py (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/backend/base/langflow/services/tracing/service.py (1)
src/backend/base/langflow/services/tracing/traceloop.py (1): TraceloopTracer (46-343)
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: Update Starter Projects
- GitHub Check: Run Ruff Check and Format
- GitHub Check: Optimize new Python code in this PR
- GitHub Check: Ruff Style Check (3.13)
- GitHub Check: Run benchmarks (3.12)
🔇 Additional comments (4)
pyproject.toml (1)
128-131: ❓ Verification inconclusive
Pin OpenTelemetry packages to the same minor/patch level to avoid ABI mismatches
All four OpenTelemetry libraries should stay in lock-step (API, SDK, exporter and instrumentation).
Declaring them with independent >= ranges makes it easy for the resolver to pick versions that don't agree on the public interfaces (e.g. 1.32.0 opentelemetry-api with 1.30.0 opentelemetry-sdk).
Consider pinning them to the same version, or at least the same ~=1.30 constraint:
-"opentelemetry-instrumentation-langchain>=0.33.0",
-"opentelemetry-exporter-otlp>=1.30.0",
-"opentelemetry-sdk>=1.30.0",
-"opentelemetry-api>=1.30.0",
+"opentelemetry-instrumentation-langchain>=0.33.0,<0.34",
+"opentelemetry-exporter-otlp~=1.30.0",
+"opentelemetry-sdk~=1.30.0",
+"opentelemetry-api~=1.30.0",
This will minimise "Module xyz has no attribute ..." errors at runtime.
You can quickly check what versions would be selected by Poetry/Hatch with:
🏁 Script executed:
#!/bin/bash
uv pip install -r requirements.txt --dry-run | grep opentelemetry
Length of output: 160
🏁 Script executed:
#!/bin/bash
grep -R "opentelemetry" -n pyproject.toml
Length of output: 221
🏁 Script executed:
#!/bin/bash
grep -R "opentelemetry" -n poetry.lock || echo "poetry.lock not found or no entries for OpenTelemetry"
Length of output: 200
Ensure OpenTelemetry dependencies stay in lock-step
In pyproject.toml (lines 128–131), you currently have:
"opentelemetry-instrumentation-langchain>=0.33.0",
"opentelemetry-exporter-otlp>=1.30.0",
"opentelemetry-sdk>=1.30.0",
"opentelemetry-api>=1.30.0",
Using >= ranges independently allows the resolver to mix minor/patch releases (e.g. api 1.32 with sdk 1.30), which can lead to runtime "Module xyz has no attribute …" errors. Pin them to the same series:
-"opentelemetry-instrumentation-langchain>=0.33.0",
-"opentelemetry-exporter-otlp>=1.30.0",
-"opentelemetry-sdk>=1.30.0",
-"opentelemetry-api>=1.30.0",
+"opentelemetry-instrumentation-langchain~=0.33.0",
+"opentelemetry-exporter-otlp~=1.30.0",
+"opentelemetry-sdk~=1.30.0",
+"opentelemetry-api~=1.30.0",
This ensures all four packages are resolved in lock-step.
• File: pyproject.toml
• Lines: 128–131
To verify, run your lock-file tool (e.g. poetry lock --no-update && poetry show opentelemetry-api) and confirm all Otel packages share the same minor version.
src/backend/tests/unit/services/tracing/test_tracing_service.py (1)
142-146: Great – traceloop tracer is correctly patched into the service tests
No issues spotted here.
src/backend/base/langflow/services/tracing/service.py (1)
245-246: 👍 Traceloop tracer is now included in the start-up sequence
src/backend/base/langflow/services/tracing/traceloop.py (1)
241-246: chat_input_value / chat_output_value are never populated
end() writes these attributes, but no method currently sets them.
Either populate them when you receive user / model messages or drop the attributes to avoid empty fields in the exported spans.
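On the pyproject.toml pinning comment in this review: under PEP 440, the compatible-release operator ~=1.30.0 is shorthand for >=1.30.0, ==1.30.*. The sketch below approximates that rule with the standard library only, to show why 1.30.x releases pass and 1.32.0 does not. The helper names are hypothetical and it only handles plain numeric versions (no pre-releases or local tags); a real check would normally go through the resolver or the packaging library.

```python
def parse(version: str) -> tuple[int, ...]:
    """Split a plain numeric version such as '1.30.2' into integers."""
    return tuple(int(part) for part in version.split("."))


def is_compatible_release(candidate: str, spec: str) -> bool:
    """Approximate PEP 440 `~=spec` for purely numeric versions (hypothetical helper)."""
    spec_parts = parse(spec)
    if len(spec_parts) < 2:
        raise ValueError("~= needs at least two version components")
    cand_parts = parse(candidate)
    # Pad the candidate with zeros so that '1.30' is treated like '1.30.0'.
    cand_parts += (0,) * (len(spec_parts) - len(cand_parts))
    # Everything except the last component of the spec must match exactly.
    prefix = spec_parts[:-1]
    return cand_parts >= spec_parts and cand_parts[: len(prefix)] == prefix


# With ~=1.30.0 on every package, api and sdk cannot drift to different minors.
print(is_compatible_release("1.30.4", "1.30.0"))
print(is_compatible_release("1.32.0", "1.30.0"))
```

The first call accepts a patch release in the 1.30 series; the second rejects 1.32.0 because its minor component differs.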
Actionable comments posted: 3
♻️ Duplicate comments (1)
src/backend/base/langflow/services/tracing/traceloop.py (1)
180-181: Confirm the critical issue fix.
I can see that the critical issue from previous reviews has been addressed - the child span is now properly stored in self.child_spans at line 181, which will prevent the KeyError in end_trace.
🧹 Nitpick comments (5)
src/backend/base/langflow/services/tracing/traceloop.py (5)
285-286: Simplify isinstance check for BaseMessage types.
Since HumanMessage and SystemMessage are subclasses of BaseMessage, checking for BaseMessage alone is sufficient.
-        elif isinstance(value, (BaseMessage | HumanMessage | SystemMessage)):
+        elif isinstance(value, BaseMessage):
325-326: Remove redundant isinstance check.
The isinstance(error, Exception) check is redundant since the parameter is already typed as Exception | None and we've already verified error exists.
-        if isinstance(error, Exception):
-            current_span.record_exception(error)
-        else:
+        current_span.record_exception(error)
+
+        # Add additional exception details for non-standard exceptions
+        if not isinstance(error, Exception):
93-95: Consider more specific exception handling.
The broad exception catch in __init__ might hide important setup issues. Consider logging at warning level or handling specific exceptions differently.
-        except Exception:  # noqa: BLE001
-            logger.opt(exception=True).debug("Error setting up Traceloop tracer")
+        except Exception as e:  # noqa: BLE001
+            logger.opt(exception=True).warning(f"Error setting up Traceloop tracer: {e}")
             self._ready = False
147-156: Consider graceful degradation for LangChain instrumentation.
The LangChain instrumentation failure causes the entire tracer setup to fail. Consider making this optional since the core tracing functionality can work without it.
         try:
             from opentelemetry.instrumentation.langchain import LangchainInstrumentor
             LangchainInstrumentor().instrument(tracer_provider=self.tracer_provider, skip_dep_check=True)
         except ImportError:
-            logger.exception(
+            logger.warning(
                 "Could not import LangchainInstrumentor."
                 "Please install it with `pip install opentelemetry-instrumentation-langchain`."
             )
-            return False
+            # Continue without LangChain instrumentation
         return True
114-116: Consider making project name configurable.
The project name is hardcoded as "LANGFLOW". Consider making this configurable through the constructor or environment variables for better flexibility.
-        project_name = "LANGFLOW"
+        project_name = os.getenv("TRACELOOP_PROJECT_NAME", self.project_name or "LANGFLOW")
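As a rough illustration of the configurable project name nitpick, here is a minimal standalone sketch of the fallback chain: environment variable first, then a value passed in by the caller, then the "LANGFLOW" default. The function name resolve_project_name and the configured parameter are hypothetical and not part of the PR; only the TRACELOOP_PROJECT_NAME variable and the default come from the suggestion in this review.

```python
import os
from typing import Optional

DEFAULT_PROJECT_NAME = "LANGFLOW"


def resolve_project_name(configured: Optional[str] = None) -> str:
    """Pick the project name: env var, then the configured value, then the default.

    Hypothetical helper, not the PR's code.
    """
    # `or` treats an empty TRACELOOP_PROJECT_NAME the same as an unset one.
    return os.getenv("TRACELOOP_PROJECT_NAME") or configured or DEFAULT_PROJECT_NAME


print(resolve_project_name("my-flow"))
```

One design difference from the suggested os.getenv(name, default) form: with a default argument, an environment variable that is set but empty would yield an empty project name, whereas chaining with or falls through to the next candidate.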
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- src/backend/base/langflow/services/tracing/service.py (3 hunks)
- src/backend/base/langflow/services/tracing/traceloop.py (1 hunks)
- src/backend/tests/unit/services/tracing/test_tracing_service.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/backend/tests/unit/services/tracing/test_tracing_service.py
- src/backend/base/langflow/services/tracing/service.py
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: Run benchmarks (3.12)
- GitHub Check: Ruff Style Check (3.13)
- GitHub Check: Update Starter Projects
- GitHub Check: Run Ruff Check and Format
- GitHub Check: Optimize new Python code in this PR
string_stacktrace = traceback.format_exception(error)
error_message = f"{error.__class__.__name__}: {error}\n\n{string_stacktrace}"
Fix traceback formatting bug.
traceback.format_exception() returns a list of strings, not a single string. You need to join the list.
- string_stacktrace = traceback.format_exception(error)
- error_message = f"{error.__class__.__name__}: {error}\n\n{string_stacktrace}"
+ string_stacktrace = "".join(traceback.format_exception(type(error), error, error.__traceback__))
+ error_message = f"{error.__class__.__name__}: {error}\n\n{string_stacktrace}"
🤖 Prompt for AI Agents
In src/backend/base/langflow/services/tracing/traceloop.py at lines 304-305, the
variable string_stacktrace is assigned the result of
traceback.format_exception(error), which returns a list of strings. To fix the
traceback formatting bug, join the list into a single string before including it
in error_message. Use a string join method to concatenate the list elements into
one string.
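To make the traceback fix concrete, here is a small standalone sketch using only the standard library. The helper name format_error is hypothetical; it mirrors the two lines from the suggestion and shows that joining the list yields one readable string rather than the repr of a list.

```python
import traceback


def format_error(error: BaseException) -> str:
    """Return '<Class>: <message>' followed by the full traceback as one string."""
    # format_exception returns a list of lines, each already ending in a newline.
    lines = traceback.format_exception(type(error), error, error.__traceback__)
    stacktrace = "".join(lines)
    return f"{error.__class__.__name__}: {error}\n\n{stacktrace}"


try:
    1 / 0
except ZeroDivisionError as exc:
    print(format_error(exc))
```

Without the join, the f-string would embed the list literally, including brackets, quotes and escaped newlines.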
elif isinstance(value, (types.GeneratorType | types.NoneType)):
    value = str(value)
Prefer type(None) over types.NoneType.
types.NoneType exists only in Python 3.10+ and is not deprecated there. Use type(None) instead, which is the same class and works on every Python version.
- elif isinstance(value, (types.GeneratorType | types.NoneType)):
+ elif isinstance(value, (types.GeneratorType, type(None))):
🤖 Prompt for AI Agents
In src/backend/base/langflow/services/tracing/traceloop.py around lines 291 to
292, replace types.NoneType with type(None) in the isinstance check. Change the
condition from isinstance(value, (types.GeneratorType | types.NoneType)) to
isinstance(value, (types.GeneratorType, type(None))) so that the check does not
depend on types.NoneType, which exists only in Python 3.10 and later.
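A short standalone sketch of the tuple form of the check, using only the standard library. The function name stringify_special is hypothetical and stands in for the branch of the PR's conversion code discussed here.

```python
import types


def stringify_special(value):
    """Convert generators and None to their string form; return other values unchanged."""
    # type(None) is the same class as types.NoneType, but it also exists before Python 3.10.
    if isinstance(value, (types.GeneratorType, type(None))):
        return str(value)
    return value


print(stringify_special(None))  # None is turned into the string "None"
print(stringify_special(42))    # other values pass through untouched
```

Passing a plain tuple of classes to isinstance avoids building a union object with the | operator, which itself needs Python 3.10 or later.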
if trace_type == "prompt":
    child_span.set_attribute("span.kind", SpanKind.INTERNAL)
else:
    otel_span_kind = trace_type_mapping.get(trace_type, SpanKind.INTERNAL)
    child_span.set_attribute("span.kind", otel_span_kind.name.lower())
Fix inconsistent span kind attribute setting.
The span kind setting logic is inconsistent between trace types. For "prompt", you're setting the attribute directly to SpanKind.INTERNAL, but for other types, you're converting the enum to a lowercase string.
- # Map trace types to OpenTelemetry span kinds
- if trace_type == "prompt":
- child_span.set_attribute("span.kind", SpanKind.INTERNAL)
- else:
- otel_span_kind = trace_type_mapping.get(trace_type, SpanKind.INTERNAL)
- child_span.set_attribute("span.kind", otel_span_kind.name.lower())
+ # Map trace types to OpenTelemetry span kinds
+ otel_span_kind = trace_type_mapping.get(trace_type, SpanKind.INTERNAL)
+ child_span.set_attribute("span.kind", otel_span_kind.name.lower())
🤖 Prompt for AI Agents
In src/backend/base/langflow/services/tracing/traceloop.py around lines 184 to
188, the span kind attribute is set inconsistently: for "prompt" it is assigned
the enum SpanKind.INTERNAL directly, while for other trace types it is assigned
the lowercase string of the enum name. To fix this, ensure that the span kind
attribute is always set as a lowercase string representation of the enum name
for all trace types, including "prompt", by converting SpanKind.INTERNAL to its
lowercase name string.
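The unified lookup can be sketched without the OpenTelemetry SDK as follows. The SpanKind enum here is a stand-in that copies the member names of opentelemetry.trace.SpanKind, and the entries in trace_type_mapping are hypothetical, since the PR's real mapping is not shown in this thread.

```python
from enum import Enum


class SpanKind(Enum):
    """Stand-in for opentelemetry.trace.SpanKind so the sketch runs without the SDK."""
    INTERNAL = 0
    SERVER = 1
    CLIENT = 2
    PRODUCER = 3
    CONSUMER = 4


# Hypothetical mapping; the actual entries in the PR may differ.
trace_type_mapping = {"llm": SpanKind.CLIENT, "tool": SpanKind.CLIENT}


def span_kind_attribute(trace_type: str) -> str:
    """Return the lowercase span-kind name for any trace type, including 'prompt'."""
    kind = trace_type_mapping.get(trace_type, SpanKind.INTERNAL)
    return kind.name.lower()


print(span_kind_attribute("prompt"))
print(span_kind_attribute("llm"))
```

Because "prompt" is absent from the mapping, it falls through to the INTERNAL default and gets the same string treatment as every other type, so no special case is needed.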
This PR introduces support for exporting traces to Traceloop by:
- Adding a new module, "traceloop.py", for Traceloop-specific tracing configuration and instrumentation; it also supports IBM Instana.
- Initializing Traceloop tracing from "service.py" so that it is active when Langflow starts.
- Adding the required Traceloop and OpenTelemetry dependencies to "pyproject.toml".
- Adding tests for the Traceloop integration.
📁 Changes Made
➕ langflow/src/backend/base/langflow/services/tracing/traceloop.py:
Contains the Traceloop tracer setup using OpenTelemetry.
🔁 Modified service.py:
Imports the Traceloop tracer and sets it up during tracing service startup.
🧩 Updated pyproject.toml:
Added the dependencies required for the Traceloop integration (e.g., opentelemetry-sdk, opentelemetry-exporter-otlp).
🧪 Enhanced test_tracing_service.py:
Added unit tests for Traceloop tracer initialization, configuration validation, and integration with the tracing service.
🎯 Purpose
This update enables Langflow to export telemetry data (spans, traces) to Traceloop and Instana for improved observability and debugging, laying the foundation for distributed tracing.
Screenshots of the Traceloop and Instana Dashboards:

