Announcement

**oshunluvr** · Oct 29, 2024, 09:45 AM

OK, well never mind sort of. The above errors were because the file got corrupted during one of the conversions and the corruption started at line 1833286. I'm going to try it all again.

**oshunluvr** · Oct 29, 2024, 10:01 AM

Tried again with a fresh csv conversion. Since "file" reports it as ascii I tried both us-ascii and utf8 and got these errors:

postgres-# postgres=# \copy schema.table from '/tmp/sourcefile.csv' DELIMITER ',' ENCODING 'sql-ascii' CSV HEADER;
ERROR: unquoted carriage return found in data
HINT: Use quoted CSV field to represent carriage return.
CONTEXT: COPY sourcefile, line 195432

postgres-# postgres=# \copy schema.table from '/tmp/sourcefile.csv' DELIMITER ',' ENCODING 'utf8' CSV HEADER;
ERROR: invalid byte sequence for encoding "UTF8": 0xbd
CONTEXT: COPY sourcefile, line 1725
postgres-#

**oshunluvr** · Oct 29, 2024, 10:18 AM

I was able to open the file with Kate (took almost a full minute to open), then changed the encoding to UTF8. Kate reported "non-UTF8" characters in the file. So I saved it and then I ran

Code:

iconv -f utf8 -t utf8 sourcefile.csv -o sourcefile1.csv

Diff showed there was a lot of changes made, so here I go to postgres try again...

**oshunluvr** · Oct 29, 2024, 10:58 AM

OK I got it to load. The file still had some invalid characters in it, but when I set encoding to sql-ascii instead of utf8, it loaded all the lines. I suspect there may be a few fields that are corrupted or missing, but I can live with that for now.

**Snowhog** · Oct 29, 2024, 11:26 AM

Originally posted by oshunluvr View Post

I have a text file that is 8,000,000+ lines in length.

Good gawd man, what magical tomb do you have! An 8x11 page in portrait orientation will have an average of 66 single spaced lines, so your document is 121,212.121212 pages long!

**oshunluvr** · Oct 29, 2024, 11:54 AM

I know, right?

It's actually nearly 9 million lines.

8,944,327 rows and 67 columns. Not all the fields are populated.

**oshunluvr** · Oct 29, 2024, 12:01 PM

I'm surprised at how quickly the server processes queries considering the size of database. Super simple ones. but still, that's a lot of data. My postgres server is Kubuntu 24.04 running in a QEMU VM.

**Snowhog** · Oct 29, 2024, 02:48 PM

Originally posted by oshunluvr View Post

8,944,327 rows and 67 columns.

That's 599,269,909 cells! That's enough room to stuff every American into a cell, and still have 253,247,093 cells left over! Like I said: Oh my gawd!

Announcement

Have a heck of a time getting a file "cleaned" so postgres 'copy' will accept it. "iconv" not working or???

[RESOLVED] Have a heck of a time getting a file "cleaned" so postgres 'copy' will accept it. "iconv" not working or???

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment