Scatter-Gather pattern

Bronislav Klučka, Sep 26, 2025, 09:44 AM

Scatter-Gather is another member of the family of design patterns dedicated to separation of concerns. Its main benefit is parallelization: the ability to perform several operations at once. It is therefore best suited to environments and languages that support parallel execution through threads, workers, or separate processes.

Example: spam filter

We have already covered spam filters, but let's take another look at them from a slightly different angle.

Rules:

  • We need to check the content of the email against the spam filter.

  • We need to check the email headers (DKIM, SPF, etc.).

  • We need to check the sender.

Pseudo-implementation - rules engine

interface Email {
	text: string;
	sender: string;
	headers: string[];
}

interface SpamRule {
	/**
	 * Scores an email against this rule.
	 * @param email the email to check
	 * @returns score modifier; the email counts as spam when the total is above 0
	 */
	processEmail: (email: Email) => number;
}

class SpamEngine {
	protected rules: SpamRule[] = [];

	addRules(rules: SpamRule[]) {
		this.rules.push(...rules);
	}

	execute(email: Email): boolean {
		const score = this.rules.reduce((acc, rule) => {
			return acc + rule.processEmail(email);
		}, 0);
		return score > 0;
	}
}

class SpamText implements SpamRule {
	processEmail(email: Email): number {
		// process
		return 1;
	}
}

class SpamSender implements SpamRule {
	processEmail(email: Email): number {
		// process
		return 1;
	}
}

class SpamHeaders implements SpamRule {
	processEmail(email: Email): number {
		// process
		return 1;
	}
}

/************************************
 * program.mts
 */

const spamEngine = new SpamEngine();
spamEngine.addRules([
	new SpamText(),
	new SpamSender(),
	new SpamHeaders(),
]);

console.log(spamEngine.execute({
	sender: 'john.doe@company.com',
	headers: ['from: john.doe@company.com'],
	text: 'this is spam',
}));

The above implementation is perfectly fine, but it has one drawback: the rules run serially, one after another. In some cases, such as the pipeline pattern, the order must be preserved. Here the order does not matter, so scatter-gather can speed up processing.

Scatter-Gather

The essence of the scatter-gather pattern is that, instead of performing the calculation itself in the main thread, the code delegates (scatters) the calculation to parallel processes and then combines (gathers) their results.
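Stripped of the spam-filter details, the idea can be sketched as one small generic helper. This is only an illustration: `Task`, `scatterGather`, `sum`, and the three string checks are hypothetical names that do not appear elsewhere in this article.

```typescript
// Hypothetical generic helper, for illustration only.
type Task<I, O> = (input: I) => Promise<O>;

async function scatterGather<I, O, R>(
	input: I,
	tasks: Task<I, O>[],
	gather: (results: O[]) => R,
): Promise<R> {
	// Scatter: start every task with the same input before awaiting any of them.
	const pending = tasks.map(task => task(input));
	// Gather: wait for all results (Promise.all keeps task order) and combine them.
	const results = await Promise.all(pending);
	return gather(results);
}

// Three illustrative checks on a string, combined by summing.
const wordCount: Task<string, number> = async text => text.split(/\s+/).length;
const charCount: Task<string, number> = async text => text.length;
const upperCount: Task<string, number> = async text => (text.match(/[A-Z]/g) ?? []).length;
const sum = (results: number[]): number => results.reduce((acc, cur) => acc + cur, 0);

scatterGather('Buy NOW', [wordCount, charCount, upperCount], sum)
	.then(total => console.log(total)); // 2 words + 7 characters + 4 capitals = 13
```

The helper knows nothing about what the tasks do or how results are combined, which is the separation of concerns the pattern is after.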

Adapting the spam filter example is straightforward:

Pseudo-implementation - scatter-gather

interface Email {
	text: string;
	sender: string;
	headers: string[];
}


interface SpamRule {
	/**
	 * Scores an email against this rule asynchronously.
	 * @param email the email to check
	 * @returns promise of a score modifier; the email counts as spam when the total is above 0
	 */
	processEmail: (email: Email) => Promise<number>;
}

class SpamScatterGather {
	protected rules: SpamRule[] = [];

	addRules(rules: SpamRule[]) {
		this.rules.push(...rules);
	}

	async execute(email: Email): Promise<boolean> {
		// scatter: start every rule at once, without waiting for the previous one
		// gather: wait until all rules have finished, then combine their scores
		const scores = await Promise.all(this.rules.map(rule => rule.processEmail(email)));
		const score = scores.reduce((acc, cur) => {
			return acc + cur;
		}, 0);
		return score > 0;
	}
}


class SpamText implements SpamRule {
	async processEmail(email: Email): Promise<number> {
		// process
		return 1;
	}
}

class SpamSender implements SpamRule {
	async processEmail(email: Email): Promise<number> {
		// process
		return 1;
	}
}

class SpamHeaders implements SpamRule {
	async processEmail(email: Email): Promise<number> {
		// process
		return 1;
	}
}


/************************************
 * program.mts
 */

const spamSG = new SpamScatterGather();
spamSG.addRules([
	new SpamText(),
	new SpamSender(),
	new SpamHeaders(),
]);

console.log(await spamSG.execute({
	sender: 'john.doe@company.com',
	headers: ['from: john.doe@company.com'],
	text: 'this is spam',
}));
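One property of `Promise.all` in the listing above is worth keeping in mind: it rejects as soon as any rule rejects, so a single failing rule fails the whole check. If that is not acceptable, the gather step can use `Promise.allSettled` instead. The sketch below is an assumption-laden variant, not part of the original design: `AsyncRule` and `gatherTolerant` are hypothetical names, and counting a failed rule as a score of 0 is a policy chosen purely for illustration.

```typescript
interface Email {
	text: string;
	sender: string;
	headers: string[];
}

// Hypothetical function-style rule, equivalent to SpamRule.processEmail.
type AsyncRule = (email: Email) => Promise<number>;

// Gather variant that tolerates failing rules.
// Assumption: a rejected rule contributes 0 to the score and is only counted.
async function gatherTolerant(
	email: Email,
	rules: AsyncRule[],
): Promise<{ score: number; failed: number }> {
	// allSettled waits for every rule, whether it fulfilled or rejected.
	const settled = await Promise.allSettled(rules.map(rule => rule(email)));
	let score = 0;
	let failed = 0;
	for (const outcome of settled) {
		if (outcome.status === 'fulfilled') {
			score += outcome.value;
		} else {
			failed += 1;
		}
	}
	return { score, failed };
}

const sampleEmail: Email = {
	sender: 'john.doe@company.com',
	headers: ['from: john.doe@company.com'],
	text: 'this is spam',
};

const sampleRules: AsyncRule[] = [
	async () => 1,
	async () => { throw new Error('lookup failed'); },
	async () => 2,
];

gatherTolerant(sampleEmail, sampleRules)
	.then(result => console.log(result)); // { score: 3, failed: 1 }
```

Whether a failed rule should count as 0, as spam, or abort the check is a design decision that depends on the application.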

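What remains depends on the language: how complex it is to use its threads, workers, or processes. In JavaScript, for instance, Promise.all on its own only interleaves asynchronous work on a single thread; CPU-bound rules need workers or separate processes to actually run in parallel.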

For effective use, however, it is important that the main computational load is located in parallelized processes and not in the main thread of your application.
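The difference can be made visible with a small experiment. The sketch below uses two hypothetical rules: `cpuBoundRule` does its work in a plain loop on the main thread, while `ioBoundRule` awaits a timer as a stand-in for a network lookup. Both are started through `Promise.all`, and a log records the order of events.

```typescript
// Hypothetical CPU-bound rule: declared async, but its body never yields,
// so it runs to completion on the main thread before the next rule starts.
async function cpuBoundRule(name: string, log: string[]): Promise<number> {
	log.push(`start ${name}`);
	let x = 0;
	for (let i = 0; i < 1e6; i++) x += i % 7; // stand-in for heavy computation
	log.push(`end ${name}`);
	return x > 0 ? 1 : 0;
}

// Hypothetical I/O-bound rule: awaits a timer, standing in for a network lookup.
async function ioBoundRule(name: string, ms: number, log: string[]): Promise<number> {
	log.push(`start ${name}`);
	await new Promise<void>(resolve => setTimeout(resolve, ms));
	log.push(`end ${name}`);
	return 1;
}

async function demo(): Promise<{ cpu: string[]; io: string[] }> {
	const cpu: string[] = [];
	await Promise.all([cpuBoundRule('A', cpu), cpuBoundRule('B', cpu)]);

	const io: string[] = [];
	await Promise.all([ioBoundRule('A', 50, io), ioBoundRule('B', 5, io)]);

	return { cpu, io };
}

demo().then(({ cpu, io }) => {
	console.log(cpu.join(', ')); // start A, end A, start B, end B
	console.log(io.join(', '));  // start A, start B, end B, end A
});
```

The CPU-bound rules never overlap even though they go through `Promise.all`, while the waiting rules do. To overlap the heavy loop as well, it would have to move off the main thread, for example into a worker from Node.js's `worker_threads` module.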
