The generated SQL does not use indexes and performs sub-optimally #5949
Comments
This is interesting. I'm using a similar query:
query {
tracks (order_by: {album: {title:asc}} limit: 10) {
id
}
}
As expected, the SQL that is generated is:
explain analyze SELECT
coalesce(
json_agg(
"root"
ORDER BY
"root.or.album.pg.title" ASC NULLS LAST
),
'[]'
) AS "root"
FROM
(
SELECT
"_2_root.or.album"."root.or.album.pg.title" AS "root.or.album.pg.title",
row_to_json(
(
SELECT
"_3_e"
FROM
(
SELECT
"_0_root.base"."id" AS "id"
) AS "_3_e"
)
) AS "root"
FROM
(
SELECT
*
FROM
"public"."tracks"
WHERE
('true')
) AS "_0_root.base"
LEFT OUTER JOIN LATERAL (
SELECT
"_1_root.or.album.base"."title" AS "root.or.album.pg.title"
FROM
(
SELECT
*
FROM
"public"."albums"
WHERE
(("_0_root.base"."album_id") = ("id"))
) AS "_1_root.or.album.base"
) AS "_2_root.or.album" ON ('true')
ORDER BY
"root.or.album.pg.title" ASC NULLS LAST
LIMIT
10
) AS "_4_root"
When explained, this gives a similar plan:
A hand-written query (like yours) generates a much better plan:
SELECT
t.id
FROM
tracks t
INNER JOIN albums a ON (t.album_id = a.id)
ORDER BY
a.title ASC NULLS LAST
LIMIT 10
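However, the two queries are semantically different: they don't fetch the same data. The generated SQL uses LEFT OUTER JOIN LATERAL, while the hand-written query uses INNER JOIN. With the generated SQL changed to use INNER JOIN LATERAL:
explain analyze SELECT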
coalesce(
json_agg(
"root"
ORDER BY
"root.or.album.pg.title" ASC NULLS LAST
),
'[]'
) AS "root"
FROM
(
SELECT
"_2_root.or.album"."root.or.album.pg.title" AS "root.or.album.pg.title",
row_to_json(
(
SELECT
"_3_e"
FROM
(
SELECT
"_0_root.base"."id" AS "id"
) AS "_3_e"
)
) AS "root"
FROM
(
SELECT
*
FROM
"public"."tracks"
WHERE
('true')
) AS "_0_root.base"
INNER JOIN LATERAL (
SELECT
"_1_root.or.album.base"."title" AS "root.or.album.pg.title"
FROM
(
SELECT
*
FROM
"public"."albums"
WHERE
(("_0_root.base"."album_id") = ("id"))
) AS "_1_root.or.album.base"
) AS "_2_root.or.album" ON ('true')
ORDER BY
"root.or.album.pg.title" ASC NULLS LAST
LIMIT
10
) AS "_4_root"
The explain output is similar to that of the hand-written query.
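To make the difference between the two join kinds concrete, here is a minimal sketch. It assumes a hypothetical toy schema (two albums, three tracks, one track with a NULL album_id) and runs on SQLite through Python's sqlite3 module rather than on Postgres, so it only illustrates which rows come back, not the query plans. `a.title IS NULL` in the ORDER BY is a stand-in for NULLS LAST.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical toy version of the schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE albums (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE tracks (
            id INTEGER PRIMARY KEY,
            album_id INTEGER REFERENCES albums (id)  -- nullable on purpose
        );
        INSERT INTO albums (id, title) VALUES (1, 'B-sides'), (2, 'Anthology');
        INSERT INTO tracks (id, album_id) VALUES (10, 1), (11, 2), (12, NULL);
        """
    )
    return conn


def fetch_track_ids(conn, join_kind):
    """Return up to 10 track ids ordered by album title, NULL titles last."""
    if join_kind not in ("INNER", "LEFT"):
        raise ValueError("join_kind must be 'INNER' or 'LEFT'")
    sql = f"""
        SELECT t.id
        FROM tracks t
        {join_kind} JOIN albums a ON t.album_id = a.id
        ORDER BY a.title IS NULL, a.title ASC, t.id ASC
        LIMIT 10
    """
    return [row[0] for row in conn.execute(sql)]


conn = build_demo_db()

# With a track that has no album, the two joins disagree:
# INNER JOIN drops track 12, LEFT JOIN keeps it at the end.
inner_ids = fetch_track_ids(conn, "INNER")
left_ids = fetch_track_ids(conn, "LEFT")

# Once every track has a matching album, both joins return the same rows.
conn.execute("DELETE FROM tracks WHERE album_id IS NULL")
inner_ids_all_matched = fetch_track_ids(conn, "INNER")
left_ids_all_matched = fetch_track_ids(conn, "LEFT")
```

The rows differ only while some track has no matching album, so an engine can swap LEFT JOIN for INNER JOIN only when it knows no such track can exist.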
So the question remains: what should graphql-engine generate when it sees a query such as this?
query {
tracks (order_by: {album: {title:asc}} limit: 10) {
id
}
}
SELECT
t.id
FROM
tracks t
INNER JOIN albums a ON (t.album_id = a.id)
ORDER BY
a.title ASC NULLS LAST
LIMIT 10
SELECT
t.id
FROM
tracks t
LEFT JOIN albums a ON (t.album_id = a.id)
ORDER BY
a.title ASC NULLS LAST
LIMIT 10
Unfortunately, though, Postgres's query planner doesn't seem to account for this, and as a result we see a sub-optimal plan. Can this be fixed at graphql-engine's layer? Sure, we can generate |
Would it be possible to include it in GraphQL engine? We could try to take a stab at implementing this. The reason for this initiative is that we need the optimization because it blocks certain functionality in a project of ours. At the same time we do not know how hard it would be to include the optimization in Postgres and how long such a change would take. |
Looks like this is fairly simple to implement in graphql-engine. We already have |
@0x777 What should be done to merge the work from that branch into graphql engine? How long could it take? |
We are experiencing a performance issue related to the fact that Hasura generates a sub-optimal SQL query.
Simplified, let's say that we have an order table with the fields id, supplierA, and supplierB. We also have a supplier table with the fields a, b, and name.
The relation is the following:
We issue a query like this:
The idea here is to get orders which are ordered by the names of their corresponding suppliers. Here is the SQL code that Hasura generates:
If we run EXPLAIN ANALYZE on it, we can see the following output:
An important detail here is that there exist indexes:
Yet they seem to be unused in this case.
However, if we re-write the query manually:
The indexes are used and EXPLAIN ANALYZE returns the following result:
Would it be possible to improve the performance of the generated SQL in cases like this?