Skip to content

Preserve integer-exact representation of numbers and allow conversion of floating point values to integers #128540

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

efd6
Copy link

@efd6 efd6 commented May 28, 2025

This differentiates integral floating point values that are within the range of exact integer correspondence and renders them as integers rather than using the default java toString method which formats floating point values in E-notation when the absolute value is at least 1e7.

The new formatting is used as the input string for the INTEGER and LONG convert methods enabling conversion from floating point to integer numbers.

ref: elastic/beats#43659

  • Have you signed the contributor license agreement? ✅
  • Have you followed the contributor guidelines? ✅
  • If submitting code, have you built your formula locally prior to submission with gradle check?
  • If submitting code, is your pull request against main? Unless there is a good reason otherwise, we prefer pull requests against main and will backport as needed. ✅
  • If submitting code, have you checked that your submission is for an OS and architecture that we support? ✅
  • If you are submitting this code for a class then read our policy for that.

Questions

Currently only successful convert operations are tested for the new change. Do we want tests for long/int converts from floating point inputs that have fractional parts or are outside the exact integer representation range? These are expected (intended) to error out. For the string convert from the same floating point values, we just expect the string to either include a decimal or be in E-notation respectively.

@efd6 efd6 self-assigned this May 28, 2025
@elasticsearchmachine
Copy link
Collaborator

@efd6 please enable the option "Allow edits and access to secrets by maintainers" on your PR. For more information, see the documentation.

@elasticsearchmachine elasticsearchmachine added v9.1.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels May 28, 2025
@efd6 efd6 force-pushed the integer_precise_convert branch 9 times, most recently from 324e01d to 6bcb33c Compare May 28, 2025 07:40
@efd6 efd6 changed the title Preserve integer-exact representation of numbers and allow conversion of floating point values to integers Preserve integer-exact representation of numbers in convert processor to string May 28, 2025
@efd6 efd6 force-pushed the integer_precise_convert branch 2 times, most recently from 841b7ae to 3bc17b7 Compare May 28, 2025 11:03
@efd6 efd6 marked this pull request as ready for review May 28, 2025 12:13
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label May 28, 2025
@@ -102,6 +102,14 @@ public Object convert(Object value) {
STRING {
@Override
public Object convert(Object value) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did want to be able to use this in the LONG and INTEGER convert functions in place of the toString call there, but this does not appear to be possible. Doing that would allow this to naturally solve #128160.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on why that doesn't appear to be possible?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have expected this to work

diff --git a/modules/ingest-common/src/main/java/org/elasticsearch/ingest/common/ConvertProcessor.java b/modules/ingest-common/src/main/java/org/elasticsearch/ingest/common/ConvertProcessor.java
index eaaf97da0e9..1fb1ebc5d6a 100644
--- a/modules/ingest-common/src/main/java/org/elasticsearch/ingest/common/ConvertProcessor.java
+++ b/modules/ingest-common/src/main/java/org/elasticsearch/ingest/common/ConvertProcessor.java
@@ -34,7 +34,7 @@ public final class ConvertProcessor extends AbstractProcessor {
             @Override
             public Object convert(Object value) {
                 try {
-                    String strValue = value.toString();
+                    String strValue = STRING.convert(value);
                     if (strValue.startsWith("0x") || strValue.startsWith("-0x")) {
                         return Integer.decode(strValue);
                     }
@@ -49,7 +49,7 @@ public final class ConvertProcessor extends AbstractProcessor {
             @Override
             public Object convert(Object value) {
                 try {
-                    String strValue = value.toString();
+                    String strValue = STRING.convert(value);
                     if (strValue.startsWith("0x") || strValue.startsWith("-0x")) {
                         return Long.decode(strValue);
                     }

but it did not.

Execution failed for task ':modules:ingest-common:compileJava'.
> Compilation failed; see the compiler output below.
Note: Recompile with -Xlint:deprecation for details.
/dev/shm/bk/bk-agent-prod-gcp-1748468850671853465/elastic/elasticsearch-pull-request/modules/ingest-common/src/main/java/org/elasticsearch/ingest/common/ConvertProcessor.java:37: error: incompatible types: Object cannot be converted to String
                    String strValue = STRING.convert(value);
                                                    ^
/dev/shm/bk/bk-agent-prod-gcp-1748468850671853465/elastic/elasticsearch-pull-request/modules/ingest-common/src/main/java/org/elasticsearch/ingest/common/ConvertProcessor.java:52: error: incompatible types: Object cannot be converted to String
                    String strValue = STRING.convert(value);
                                                    ^
Note: Some input files use or override a deprecated API.
2 errors

Copy link
Author

@efd6 efd6 May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, does it need a coercion? String strValue = (String) STRING.convert(value);

Yes. Yes it does.

@efd6 efd6 requested a review from joegallo May 28, 2025 12:16
}
if (isExactIntegerFloat(value)) {
Long l = Long.valueOf((long) ((Float) value).floatValue());
return l.toString();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--- a/modules/ingest-common/src/main/java/org/elasticsearch/ingest/common/ConvertProcessor.java
+++ b/modules/ingest-common/src/main/java/org/elasticsearch/ingest/common/ConvertProcessor.java
@@ -103,12 +103,10 @@ public final class ConvertProcessor extends AbstractProcessor {
             @Override
             public Object convert(Object value) {
                 if (isExactIntegerDouble(value)) {
-                    Long l = Long.valueOf((long) ((Double) value).doubleValue());
-                    return l.toString();
+                    return String.valueOf(((Double) value).longValue());
                 }
                 if (isExactIntegerFloat(value)) {
-                    Long l = Long.valueOf((long) ((Float) value).floatValue());
-                    return l.toString();
+                    return String.valueOf(((Float) value).longValue());
                 }
                 return value.toString();
             }

I'm pretty sure this can golf down a bit, which I think makes it a bit clearer, and I think it'll be faster this way since it saves some boxing/unboxing conversions.

@@ -149,6 +157,24 @@ public static Type fromString(String processorTag, String propertyName, String t
);
}
}

private static boolean isExactIntegerFloat(Object value) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this (and the other one) would be better as isExactLongFloat? There's the mathematical notion of integers, of course, and then the actual java int type, but since this is using long for the work it seems like maybe Long would be better in the name. This is just a thought, though -- feel free to push back on it. (For example, another satisfactory solution might be a docstring.)

Copy link
Author

@efd6 efd6 May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, "integer" here is the mathematical concept. That long is used as the integer type to hold the value is an implementation detail that I don't think anyone should care about beyond the type being wider than the mantissa of the floating point type that's being checked (float could use int, but I kept them the same for simplicity).

@joegallo joegallo added :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP v8.19.0 labels May 28, 2025
@joegallo joegallo self-assigned this May 28, 2025
@joegallo joegallo added the Team:Data Management Meta label for data/management team label May 28, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine elasticsearchmachine removed the needs:triage Requires assignment of a team area label label May 28, 2025
@efd6 efd6 force-pushed the integer_precise_convert branch from bc5fde8 to d6072d8 Compare May 28, 2025 21:53
@efd6
Copy link
Author

efd6 commented May 29, 2025

The elasticsearch-ci/part-2 failure appears to be unrelated.

:distribution:bwc:staged:buildBwcLinuxTar FAILED	
Downloading https://services.gradle.org/distributions/gradle-8.14-all.zip	
 [8.19.0] .....................10%.....................20%......................30%.........	
Exception in thread "main" java.net.SocketException: Connection reset
<snip>

@efd6 efd6 requested a review from joegallo May 29, 2025 00:30
@efd6 efd6 changed the title Preserve integer-exact representation of numbers in convert processor to string Preserve integer-exact representation of numbers and allow conversion of floating point values to integers May 29, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @efd6, I've created a changelog YAML for you.

@@ -34,7 +34,7 @@ enum Type {
@Override
public Object convert(Object value) {
try {
String strValue = value.toString();
String strValue = (String) STRING.convert(value);
if (strValue.startsWith("0x") || strValue.startsWith("-0x")) {
Copy link
Author

@efd6 efd6 May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth noting that this (the existing code, strValue.startsWith("0x") || strValue.startsWith("-0x")) incorrectly classifies hex float as hex integer and will fail if value is a string representing a hex floating point value. I imagine that this will affect ~0 people.

efd6 added 7 commits May 29, 2025 13:50
… to string

This differentiates floating point values that are within the range of
exact integer correspondence and renders them as integers rather than
using the default java toString method which formats floating point
values in E-notation when the absolute value is at least 1e6.
This was introduced from testConvertDouble where it is also unused.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >enhancement external-contributor Pull request authored by a developer outside the Elasticsearch team Team:Data Management Meta label for data/management team v8.19.0 v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants