Attachments extraction

During the attachments extraction phase, the snap-in retrieves attachments from the external system and uploads them to DevRev. This phase occurs after data extraction, transformation, and loading are completed.

Triggering event

Event types

| Event | Direction | Description |
| --- | --- | --- |
| EXTRACTION_ATTACHMENTS_START | Airdrop → Snap-in | Initiates the attachments extraction |
| EXTRACTION_ATTACHMENTS_PROGRESS | Snap-in → Airdrop | Indicates process is ongoing but runtime limit (13 minutes) reached |
| EXTRACTION_ATTACHMENTS_DELAY | Snap-in → Airdrop | Requests a delay due to rate limiting from external system |
| EXTRACTION_ATTACHMENTS_CONTINUE | Airdrop → Snap-in | Resumes the extraction process after progress update or delay |
| EXTRACTION_ATTACHMENTS_DONE | Snap-in → Airdrop | Signals successful completion of attachments extraction |
| EXTRACTION_ATTACHMENTS_ERROR | Snap-in → Airdrop | Indicates that an error occurred during extraction |
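
The event flow above can be read as a small protocol: Airdrop starts or resumes the phase, and the snap-in answers with one response. The following is a minimal sketch of that reading, not SDK code. The names `EVENT_DIRECTIONS` and `nextAirdropEvent` are hypothetical, and treating the error event as ending the phase is an assumption that the table does not state explicitly.

```typescript
// Hypothetical model of the event flow described in the table; not part of the SDK.
type Direction = "airdrop-to-snap-in" | "snap-in-to-airdrop";

const EVENT_DIRECTIONS: Record<string, Direction> = {
  EXTRACTION_ATTACHMENTS_START: "airdrop-to-snap-in",
  EXTRACTION_ATTACHMENTS_CONTINUE: "airdrop-to-snap-in",
  EXTRACTION_ATTACHMENTS_PROGRESS: "snap-in-to-airdrop",
  EXTRACTION_ATTACHMENTS_DELAY: "snap-in-to-airdrop",
  EXTRACTION_ATTACHMENTS_DONE: "snap-in-to-airdrop",
  EXTRACTION_ATTACHMENTS_ERROR: "snap-in-to-airdrop",
};

// Returns the Airdrop event expected after a snap-in response,
// or null when the response ends the phase.
function nextAirdropEvent(response: string): string | null {
  if (EVENT_DIRECTIONS[response] !== "snap-in-to-airdrop") {
    throw new Error(`Not a snap-in response: ${response}`);
  }
  const resumable =
    response === "EXTRACTION_ATTACHMENTS_PROGRESS" ||
    response === "EXTRACTION_ATTACHMENTS_DELAY";
  // Progress and delay are followed by CONTINUE; done ends the phase.
  // Assumption: error also ends the phase.
  return resumable ? "EXTRACTION_ATTACHMENTS_CONTINUE" : null;
}
```

For example, `nextAirdropEvent("EXTRACTION_ATTACHMENTS_DELAY")` yields the continue event, while passing `EXTRACTION_ATTACHMENTS_START` throws because Airdrop sends that event, not the snap-in.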

Implementation

Default implementation

The SDK provides a default implementation for attachments extraction. If the default behavior (iterating through attachment metadata and uploading from saved URLs) meets your needs, no additional implementation is required.

Custom implementation

If you need to customize the attachments extraction, modify the implementation in attachments-extraction.ts. Use the streamAttachments function from the WorkerAdapter class, which handles most of the functionality needed for this phase:

const response = await adapter.streamAttachments({
  stream: getFileStream,
  batchSize: 10
});

Parameters:

  • stream: (Required) Function that handles downloading attachments from the external system
  • batchSize: (Optional) Number of attachments to process simultaneously (default: 1)

Increasing the batch size above the default of 1 can significantly improve performance, but be mindful of lambda memory constraints and external system rate limits when choosing a value. A batch size between 10 and 50 typically provides good results.
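
To make the trade-off concrete, here is a minimal sketch of what processing items in batches generally looks like. It is an illustration only and not the SDK's actual implementation of batchSize. The helper name `processInBatches` and its signature are hypothetical.

```typescript
// Hypothetical helper illustrating batched concurrency; not the SDK's implementation.
async function processInBatches<T, R>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const results: R[] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    // At most `batchSize` workers are in flight at once; the next batch
    // starts only after every worker in the current batch has settled.
    const batchResults = await Promise.all(batch.map((item) => worker(item)));
    results.push(...batchResults);
  }
  return results;
}
```

In a scheme like this, a larger batch means more downloads open at the same time, so both memory use and the request rate seen by the external system grow with the batch size.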

Example 'stream' function

async function getFileStream({
  item,
}: ExternalSystemAttachmentStreamingParams): Promise<ExternalSystemAttachmentStreamingResponse> {
  const { id, url } = item;

  try {
    // Request the file as a stream so the whole body is not buffered in memory.
    const fileStreamResponse = await axiosClient.get(url, {
      responseType: 'stream',
      headers: {
        // Ask the server not to compress the response body.
        'Accept-Encoding': 'identity',
      },
    });

    return { httpStream: fileStreamResponse };
  } catch (error) {
    // Log the failure together with the attachment metadata.
    if (axios.isAxiosError(error)) {
      console.warn(`Error while fetching attachment ${id} from URL.`, serializeAxiosError(error));
      console.warn('Failed attachment metadata', item);
    } else {
      console.warn(`Error while fetching attachment ${id} from URL.`, error);
      console.warn('Failed attachment metadata', item);
    }

    // Report the failure to the caller instead of throwing.
    return {
      error: {
        message: `Failed to fetch attachment ${id} from URL.`,
      },
    };
  }
}
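
The example above treats every failure the same way. If the external system signals rate limiting with a standard Retry-After header, the wait time for a delay request could be derived from it. The following is a minimal sketch under that assumption. The helper name `retryAfterSeconds` and the 30-second fallback are hypothetical, and how the resulting value is reported back to Airdrop depends on the SDK.

```typescript
// Hypothetical helper; the name and the 30-second fallback are assumptions.
// Converts an HTTP Retry-After header value into a delay in whole seconds.
function retryAfterSeconds(
  header: string | undefined,
  now: Date = new Date(),
  fallbackSeconds = 30,
): number {
  if (!header) return fallbackSeconds;
  const trimmed = header.trim();

  // Form 1: a non-negative number of seconds, e.g. "120".
  if (/^\d+$/.test(trimmed)) return Number(trimmed);

  // Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const retryAt = Date.parse(trimmed);
  if (Number.isNaN(retryAt)) return fallbackSeconds;

  // Dates in the past mean no further waiting is needed.
  return Math.max(0, Math.ceil((retryAt - now.getTime()) / 1000));
}
```

The `now` parameter exists only so the date form can be checked against a fixed clock.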

Emitting responses

When the snap-in finishes processing, it must send exactly one of the following responses to Airdrop:

Success response

await adapter.emit(ExtractorEventType.ExtractionAttachmentsDone);

Delay response (for rate limiting)

await adapter.emit(ExtractorEventType.ExtractionAttachmentsDelay, {
  delay: "30", // Delay in seconds
});

Error response

await adapter.emit(ExtractorEventType.ExtractionAttachmentsError, {
  error: "Informative error message",
});

The snap-in must always emit exactly one response event.
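
One way to keep to a single response is to route the outcome of streaming through one if/else chain, so that exactly one branch emits. The sketch below uses local stand-ins so that it runs on its own. The `Emitter` and `StreamResult` types, the enum's string values, and the `emitSingleResponse` helper are assumptions for illustration. In a real snap-in the enum and adapter come from the SDK, and the shape of the streamAttachments result may differ.

```typescript
// Local stand-ins for illustration only; the real enum and adapter come from the SDK.
enum ExtractorEventType {
  ExtractionAttachmentsDone = "EXTRACTION_ATTACHMENTS_DONE",
  ExtractionAttachmentsDelay = "EXTRACTION_ATTACHMENTS_DELAY",
  ExtractionAttachmentsError = "EXTRACTION_ATTACHMENTS_ERROR",
}

interface Emitter {
  emit(type: ExtractorEventType, data?: Record<string, unknown>): Promise<void>;
}

// Assumed result shape: at most one of `delay` (seconds) or `error` is set.
interface StreamResult {
  delay?: number;
  error?: string;
}

// Emits exactly one response: the branches are mutually exclusive.
async function emitSingleResponse(
  adapter: Emitter,
  result: StreamResult | undefined,
): Promise<void> {
  if (result?.delay !== undefined) {
    await adapter.emit(ExtractorEventType.ExtractionAttachmentsDelay, {
      delay: String(result.delay), // Delay in seconds, as a string
    });
  } else if (result?.error !== undefined) {
    await adapter.emit(ExtractorEventType.ExtractionAttachmentsError, {
      error: result.error,
    });
  } else {
    await adapter.emit(ExtractorEventType.ExtractionAttachmentsDone);
  }
}
```

Checking delay before error is a design choice in this sketch: a rate-limited run is treated as resumable rather than failed.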