Skip to content

Count and Count distinct functions for tsds #128530

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

pabloem
Copy link
Contributor

@pabloem pabloem commented May 27, 2025

PTAL @dnhatn ? : )
LMK if you have any questions!

@pabloem pabloem marked this pull request as ready for review May 28, 2025 15:57
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label May 28, 2025
@dnhatn dnhatn added :StorageEngine/TSDB You know, for Metrics :Analytics/ES|QL AKA ESQL >non-issue and removed needs:triage Requires assignment of a team area label labels May 28, 2025
@dnhatn dnhatn requested review from dnhatn and limotova May 28, 2025 17:18
@elasticsearchmachine elasticsearchmachine added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:StorageEngine labels May 28, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@dnhatn dnhatn requested a review from not-napoleon May 28, 2025 17:19
Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some comments, but this looks great. Thanks @pabloem

@FunctionInfo(returnType = { "integer", "long" }, description = "The count over time value of a field.", type = FunctionType.AGGREGATE)
public CountOverTime(
Source source,
@Param(name = "field", type = { "aggregate_metric_double", "double", "integer", "long" }) Expression field
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method supports more types rather than these. Can you copy the list from the count function?


@Override
public CountDistinct perTimeSeriesAggregation() {
// TODO(pabloem): Do we need to take in a precision parameter here?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we should have it.

description = "The count of distinct values over time for a field.",
type = FunctionType.AGGREGATE
)
public DistinctOverTime(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we call it CountDistinctOverTime? I'm okay with DistinctOverTime, but I prefer to keep it consistent with count_distinct.

)
public DistinctOverTime(
Source source,
@Param(name = "field", type = { "aggregate_metric_double", "double", "integer", "long", "float" }) Expression field
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, I think this function supports more than these types. Can you copy the list from count_distinct?

Copy link
Contributor

@limotova limotova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just one question

CountOverTime::new
);

@FunctionInfo(returnType = { "integer", "long" }, description = "The count over time value of a field.", type = FunctionType.AGGREGATE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what scenario does this return integer? The returnType and dataType() for Count appears to only return longs

);

@FunctionInfo(
returnType = { "integer", "long" },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as before about when does this return integers?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL >non-issue :StorageEngine/TSDB You know, for Metrics Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:StorageEngine v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants