Tips: mapjoin in long-tail optimization in SQL

When joining a big table, saying A, (e.g.10M records), with a small table, saying B (e.g. 1000 records), based on some keys, the long-tail issue may popup. For example, if you have 10M product & category name records, A, and another product name and category name identity table, B. Now you want a new table patch product, category name, & category name identity together, you may write code like,

DROP TABLE IF EXISTS prd_cat_id;
CREATE TABLE IF NOT EXISTS prd_cat_id LIFECYCLE 7 AS
SELECT A.product, A.category_name, B.cat_id 
FROM A
JOIN B
ON A.category_name=B.category_name
;

Then you find the job running slowly, even instances allocated many, saying 1000. When you investigate the logview, it is found many instances completed very fast, only a few, saying 10 instances running very slow. It is the long-tail issue, i.e. some instances allocated too many samples than others, which causes overall performance very worse.

To solve the issue, what you need is to allocate almost same samples among all instances. In the case, mapjoin, can help you. The magic mapjoin can significantly improve SQL code performance.

DROP TABLE IF EXISTS prd_cat_id;
CREATE TABLE IF NOT EXISTS prd_cat_id LIFECYCLE 7 AS
SELECT /*+ mapjoin(B) */
A.product, A.category_name, B.cat_id 
FROM A
JOIN B
ON A.category_name=B.category_name
;

Rather than single small table, mapjoin can support join multiple small tables at the same time. For example, if there is another table, saying C, which mapping category name into nation, i.e. category name is nation dependent. Just write code like

DROP TABLE IF EXISTS prd_cat_id;
CREATE TABLE IF NOT EXISTS prd_cat_id LIFECYCLE 7 AS
SELECT /*+ mapjoin(B,C) */
A.product, A.category_name, B.cat_id, C.nation 
FROM A
JOIN B
ON A.category_name=B.category_name
JOIN C
ON A.category_name=C.category_name
;

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s