Issues executing bulk data transfers using the SQL Server bcp utility.

Juhi Bhatnagar 0 Reputation points

SQL Server Tools & Utilities (bcp utility)
When we use bcp out with the native format flag (-n) to stream data through a Linux named pipe (FIFO), the process encounters data corruption. The issue occurs specifically when the table contains text columns populated with embedded Unicode, binary data, or special characters.

Why This is Failing (Technical Context)

Through our investigation, we discovered a conflict between how BCP handles character data and how the Linux kernel manages named pipe buffers:

  1. The Encoding Trap: Even when Native Mode (-n) is active, the BCP client recognizes text columns as character data and attempts to apply code page translation between the SQL Server collation and the Linux client locale.
  2. Byte Mutation: Because the customer's data contains raw Unicode strings and special character byte sequences inside a standard text field, BCP's translation engine mangles the bytes mid-stream, altering the data's length and structure.
  3. Pipe Misalignment: A Linux named pipe expects strict byte boundaries. When BCP streams this mutated data, its internal block-size calculations get thrown off. The Linux pipe reader reads a corrupted byte as an EOF marker (causing truncation).
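
Point 3 can be checked in isolation from bcp. The following is a minimal sketch, not part of the original investigation, that pushes arbitrary binary data through a FIFO and compares it with what the reader received. A POSIX FIFO is a plain byte stream: the reader sees end-of-file only when every writer has closed it, not because of any particular byte value, and a full pipe buffer makes the writer block rather than drop data. The scratch file names and the 1 MiB payload size are arbitrary choices for illustration.

```shell
#!/bin/sh
# Sketch: verify that a FIFO passes arbitrary bytes through unchanged.
# All paths below are hypothetical scratch files created under mktemp.
set -eu

workdir=$(mktemp -d)
fifo="$workdir/demo.fifo"
payload="$workdir/payload.bin"
received="$workdir/received.bin"

# Start with bytes often suspected of acting as terminators
# (NUL, EOT 0x04, SUB 0x1A, 0xFF), then add 1 MiB of random data so the
# payload is much larger than the default 64 KiB Linux pipe buffer.
printf '\000\004\032\377' > "$payload"
head -c 1048576 /dev/urandom >> "$payload"

mkfifo "$fifo"

# Start the reader first; its open() blocks until a writer appears.
cat "$fifo" > "$received" &
reader_pid=$!

# The writer blocks whenever the pipe buffer is full; nothing is dropped.
cat "$payload" > "$fifo"
wait "$reader_pid"

sent_bytes=$(( $(wc -c < "$payload") ))
received_bytes=$(( $(wc -c < "$received") ))

# Byte-for-byte comparison of what was sent and what arrived.
if cmp -s "$payload" "$received"; then
  fifo_result=identical
else
  fifo_result=different
fi

echo "sent $sent_bytes bytes, received $received_bytes bytes: $fifo_result"
rm -rf "$workdir"
```

If this reports identical on the affected host, the kernel pipe itself is unlikely to be where bytes are lost, which would point the investigation toward how the writer and the reader open and close the FIFO.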

Questions to Microsoft Support

  1. Native Mode Behavior: Why does the Linux BCP client attempt character code page translation on text/LOB columns when the Native Format (-n) flag is explicitly set? Is -C RAW an undocumented requirement when exporting text columns containing arbitrary Unicode/binary content?
  2. Linux FIFO Bug: Is there a known limitation or buffer-handling bug within mssql-tools BCP when streaming raw/native binary data blocks through a POSIX named pipe (FIFO) on Linux?
  3. Long-Term Fix: Is a fix planned for the Linux BCP client to properly handle kernel pipe streaming without choking on block boundaries, or is writing to a flat file the only officially supported architecture?
  4. Configuration: Are there specific TDS packet sizes (-a) or Linux kernel pipe capacity configurations required to make Linux named pipes stable under heavy BCP binary streams?
  1. Pilladi Padma Sai Manisha 10,190 Reputation points Microsoft External Staff Moderator

    Hi @Juhi Bhatnagar

    Thanks for the detailed description. A few clarifications may help.

    1. Native mode (-n) and code page translation

    bcp -n exports data using SQL Server native format and is not expected to perform character code page translation in the same way as character mode (-c). Therefore, -C RAW is generally not required or applicable when using native format.

    The assumption that -n is altering Unicode or binary content due to code page conversion may not be accurate, and additional investigation is needed to identify where the corruption is occurring.

    2. Linux FIFO (named pipe) support

    There is no documented Microsoft limitation or known issue stating that Linux FIFOs are unsupported with bcp. However, Microsoft primarily documents and validates scenarios involving exporting to files or standard streams. Streaming directly through a POSIX named pipe is a less common integration pattern and may expose issues outside of SQL Server itself.

    3. Recommended troubleshooting

    Please capture the following information:

    • SQL Server version and build (SELECT @@VERSION)
    • bcp / mssql-tools version
    • Linux distribution and kernel version
    • Table schema and data types involved (text, varchar(max), nvarchar(max), varbinary(max), etc.)
    • Exact bcp command being executed
    • Whether the issue reproduces when exporting to a regular file instead of a FIFO

    If the issue disappears when writing to a flat file, this would help isolate the problem to the FIFO streaming path rather than native BCP serialization.
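
One way to make that isolation concrete is to export the same table twice, once to a regular file and once through the FIFO, and compare the two results byte for byte. The helper below is a minimal sketch: compare_exports and the file names in the usage note are hypothetical, and it assumes the table is not modified between the two exports.

```shell
# Sketch: compare a reference export (regular file) with a candidate
# export (captured from the FIFO). Returns 0 if identical, 1 otherwise.
compare_exports() {
  local reference=$1 candidate=$2
  local ref_size cand_size first

  ref_size=$(( $(wc -c < "$reference") ))
  cand_size=$(( $(wc -c < "$candidate") ))
  echo "reference: $ref_size bytes, candidate: $cand_size bytes"

  if cmp -s "$reference" "$candidate"; then
    echo "identical"
    return 0
  fi

  # cmp reports either the first differing byte or an early EOF on the
  # shorter file; the early-EOF message goes to stderr, so capture both.
  first=$(cmp "$reference" "$candidate" 2>&1 | head -n 1 || true)
  echo "first difference: $first"
  return 1
}
```

For example, `compare_exports audit_file.bcp audit_fifo.bcp`. A candidate that is an exact but shorter prefix of the reference suggests truncation on the streaming path, while a difference in the middle of the file would suggest that the bytes themselves were altered.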

    4. Packet size and tuning

    The -a packet size option can improve performance, but there are no documented packet-size or Linux pipe-buffer requirements needed for correctness or stability. Adjusting these settings is unlikely to resolve data corruption by itself.

    If the issue is reproducible only on Linux and only when streaming through a FIFO, collecting the above details would be the next step to determine whether this is a mssql-tools client defect or an unsupported integration pattern rather than expected bcp -n behavior.

  2. Pilladi Padma Sai Manisha 10,190 Reputation points Microsoft External Staff Moderator

    Hi @Juhi Bhatnagar
    I hope you had a chance to review the information shared earlier, and I hope this information has been helpful! If you still have questions, please let us know what is needed in the comments so the question can be answered.

  3. Juhi Bhatnagar 0 Reputation points

    Hey @Pilladi Padma Sai Manisha ,

    Following are the requested details.

    • SQL Server Version is : Microsoft SQL Azure (RTM) - 12.0.2000.8 Apr 14 2026 20:27:12 Copyright (C) 2025 Microsoft Corporation
    • BCP : # bcp -v BCP - Bulk Copy Program for Microsoft SQL Server. Copyright (C) Microsoft Corporation. All Rights Reserved. Version: 18.6.0002.1
    • Linux: OS=Ubuntu 22.04.5 LTS , Kernel version = 6.8.0-1052-azure
    • Table schema and data types involved:
      Audit table:
      CREATE TABLE dbo.Audit (
          ID INT IDENTITY(1,1) NOT NULL,
          OperationGUID CHAR(100) NOT NULL,
          OperationType INT NULL,
          ReasonGUID CHAR(100) NOT NULL,
          [XML] TEXT NULL, -- Holds the audited transaction's payload
          CreationDate DATETIME NOT NULL DEFAULT (GETUTCDATE()),
          UserID INT NOT NULL,
          -- Primary Key Constraint
          CONSTRAINT PK_Audit PRIMARY KEY CLUSTERED (ID)
      );
      GO
      -- Chronological Index optimization as defined in your model schema
      CREATE NONCLUSTERED INDEX IDX_Audit_CreationDate ON dbo.Audit (CreationDate ASC);
      GO
      The error occurs on the [XML] TEXT NULL column.
    • BCP OUT: bcp "dbo"."Audit" out "namedpipe.bcp" -n -S <server> -U <user> -P <Password> -d <db> -a 16383 -b 50000 -u -r "DrvaRL"
      We stream the data from the named pipe to audit.bcp.
    • BCP IN: bcp "dbo"."Audit" in "audit.bcp" -n -S <server> -U <user> -P <Password> -d <db> -a 16383 -b 50000 -E -u -r "DrvaRL" -q
    • The issue is not reproducible when exporting to a regular file instead of a FIFO.
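
The streaming step described above, where a reader copies the named pipe into audit.bcp while bcp writes to it, can be sketched as follows. This is an assumption about the setup, since the thread does not show the reader side: stream_via_fifo is a hypothetical helper, and a single cat is assumed as the reader. A one-shot reader like cat stops at the first end-of-file, which a FIFO delivers as soon as all writers have closed it; whether bcp keeps its output open for the whole export is not stated in the thread.

```shell
# Sketch: drain a FIFO into a regular file while a producer writes to it.
# Usage: stream_via_fifo FIFO DEST PRODUCER [ARGS...]
# The producer command is expected to write to FIFO itself.
stream_via_fifo() {
  local fifo=$1 dest=$2
  shift 2

  [ -p "$fifo" ] || mkfifo "$fifo"

  # Start the reader first so the producer's open() does not block forever.
  cat "$fifo" > "$dest" &
  local reader_pid=$!

  local producer_status=0
  "$@" || producer_status=$?

  if [ "$producer_status" -ne 0 ]; then
    # The producer may have failed before opening the FIFO, which would
    # leave the reader blocked; stop it instead of waiting indefinitely.
    kill "$reader_pid" 2>/dev/null || true
  fi
  wait "$reader_pid" 2>/dev/null || true

  echo "producer exit status: $producer_status"
  echo "bytes captured: $(( $(wc -c < "$dest") ))"
  return "$producer_status"
}
```

With the command from this thread it would be invoked roughly as `stream_via_fifo namedpipe.bcp audit.bcp bcp dbo.Audit out namedpipe.bcp -n -S "$SERVER" -U "$DB_USER" -P "$DB_PASSWORD" -d "$DB_NAME"`, where the variable names are placeholders. Comparing the reported byte count with the size of a direct file export shows whether the capture ended early.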

2 answers

  1. Erland Sommarskog 134.7K Reputation points MVP Volunteer Moderator

    First, the text data type is deprecated, and for a good reason. Convert to varchar(MAX).

    BCP will apply character conversion even with native format, because native format does not mean that all code-page issues are removed. If you export data in native mode from a database with a Spanish collation and import it into one with a Polish collation, the character ñ will not survive, since this character is not available in varchar (or text) in a Polish collation with a legacy code page.

    You can use the -C option to specify the code page of the source data, but I think that it is better to use a format file and specify the collation in the format file.

  2. AI answer

    The available documentation does not describe or confirm:

    • Any special or undocumented requirement such as -C RAW for text/LOB columns in native mode (-n).
    • Any known bug or limitation specific to using the Linux bcp client with POSIX FIFOs (named pipes) that would cause corruption or EOF misinterpretation.
    • Any planned fix or roadmap item for improving bcp behavior with Linux named pipes.
    • Any required or recommended TDS packet size (-a) or Linux kernel pipe configuration specifically for stabilizing bcp native/binary streams over FIFOs.

    From the provided material, the following points are documented and relevant:

    • Native mode (-n) is intended for SQL Server–to–SQL Server transfers and uses the native (database) data types of the data without prompting for each field. The related Unicode native mode (-N) uses native data types for noncharacter data and Unicode for character data, performs better than -w, and is recommended when transferring data that contains ANSI extended characters while taking advantage of native-mode performance.
    • Unicode character mode (-w) uses nchar storage, tab (\t) as field separator, and newline (\n) as row terminator, and is not compatible with -c.
    • On Linux and macOS, the default field terminator is tab (\t) and line terminator is newline (\n), and backslashes in command-line arguments must be escaped. Native mode (-n) is recommended for SQL Server–to–SQL Server transfers; character mode (-c) is recommended when crossing non–SQL Server systems or when the data file should not contain extended characters.
    • bcp enforces data validation and checks (for example, verifying that Unicode data has an even-byte length). Invalid data that might have loaded in older versions can now fail during bulk load.
    • The -a option sets the network packet size (version selection is the separate -V option), and there is no mention of pipe-specific tuning.
    • The documentation explicitly notes that bcp supports native data files compatible with all supported SQL Server versions, but it does not extend this guarantee to streaming through FIFOs.

    Because the context does not cover:

    • Internal code-page handling for text/LOB columns in native mode on Linux,
    • FIFO-specific behavior or bugs,
    • Any special guidance for using bcp with named pipes,
    • Or any roadmap commitments,

    no further authoritative details can be provided beyond the general bcp behavior and options described above.

    AI-generated content may be incorrect. Read our transparency notes for more information.