[Openexr-devel] Problem with deep data

Discussion:

Richard Hadsell

2018-04-19 19:25:21 UTC

I have been attempting to add a deep-data part to a multipart file. Everything compiles and runs, but the data I read back from the file do not match in any way the data I wrote to the file. I have tried simplifying as much as possible and stepping through the code in a debugger.

(BTW, the description of DeepSlice's memory layout in ImfDeepFrameBuffer.h is wrong. It doesn't describe the layout of pointers to arrays of samples.)

It looks to me like all my pointers are correct, but the data returned by Xdr::read() are not the data I sent to Xdr::write(). The headers look fine, and the RGBA image looks fine, but the deep-data part is not.

I have been using version 2.2.0, but I also tried 2.2.1 with no success.

Has anyone had similar problems with deep data? How can I debug this failure?

--
Dick Hadsell 203-992-6320 Fax: 203-992-6001
Reply-to: ***@blueskystudios.com
Blue Sky Studios http://www.blueskystudios.com
1 American Lane, Greenwich, CT 06831-2560

Piotr Stanczyk

2018-04-19 19:46:04 UTC


What kind of differences are you seeing? Could a color space
conversion be involved?

You could visualize the data in Nuke or in the native (FLTK, alas) viewer.

-Piotr

_______________________________________________
Openexr-devel mailing list
https://lists.nongnu.org/mailman/listinfo/openexr-devel

Richard Hadsell

2018-04-19 19:56:51 UTC


All channels are floats, and I am seeing mostly 0's and denormalized numbers. I haven't seen any relationship between the outgoing and incoming data. It looks like the pointer into the file is in the wrong place; that is, it's not in the same place for reading and writing.
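
One possible reading of "mostly 0's and denormalized numbers", offered as a sketch and not as a diagnosis: when small 32-bit integers (per-pixel counts, offsets and the like) are reinterpreted as 32-bit floats, they show up as zero or as denormalized values. The helper names below are hypothetical and the code uses only the standard library.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit float");

// Reinterpret a 32-bit pattern as a float without changing any bits.
float bitsAsFloat(std::uint32_t bits)
{
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// True for values that are exactly zero or denormalized (subnormal),
// which is how small unsigned integers appear when read as floats.
bool isZeroOrDenormal(float f)
{
    const int kind = std::fpclassify(f);
    return kind == FP_ZERO || kind == FP_SUBNORMAL;
}
```

If most of the values read back satisfy isZeroOrDenormal, it may be worth checking whether the reader is positioned on integer data instead of on the float samples.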

Nuke is a good idea. I'll try to find a Nuke user here who can figure out how to look at deep data. That would at least tell me whether the problem is during the write or the read.


Michael Wolf

2018-04-19 21:45:27 UTC


Peter Hillman

2018-04-19 22:36:16 UTC


Hi Richard,

It may also be instructive to confirm the library built correctly by
building and running the IlmImfTest suite. Running IlmImfTest with
"deep" as an argument will run only the deep tests.

The source code of those tests should provide further examples of how to
read/write deep data. The different tests intentionally use slightly
different approaches to read/write data. You might modify one of those
tests to disable the file cleanup, which would generate a deep file you
can read with your own code, and compare to the known values written
into the file.

Are you getting the correct sample counts but entirely incorrect data?
That would suggest you have the pointer-to-arrays pointing to the wrong
memory locations. If some of the values are correct (e.g. only the first
pixel in the image, only the first pixel on each row, or only the first
sample of each pixel) that would suggest the pointers are correct, but
the yPixelStride, xPixelStride, sampleStride (respectively) values are wrong.
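
The two-level addressing behind those strides can be sketched as follows. This is a standalone illustration and not library code: sampleAt and its parameter names are hypothetical, and it assumes a table of per-pixel pointers, each pointing at that pixel's array of float samples.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Fetch sample i of pixel (x, y).
// base         : address of the pointer table entry for pixel (0, 0)
// xStride      : bytes between table entries of horizontally adjacent pixels
// yStride      : bytes between table entries of vertically adjacent pixels
// sampleStride : bytes between consecutive samples of one pixel
float sampleAt(const char* base,
               std::size_t xStride, std::size_t yStride, std::size_t sampleStride,
               int x, int y, int i)
{
    // Step 1: find the table entry for this pixel and load the pointer stored there.
    const char* entry = base
                      + static_cast<std::size_t>(y) * yStride
                      + static_cast<std::size_t>(x) * xStride;
    const float* samples = nullptr;
    std::memcpy(&samples, entry, sizeof samples);

    // Step 2: step through that pixel's sample array.
    const char* address = reinterpret_cast<const char*>(samples)
                        + static_cast<std::size_t>(i) * sampleStride;
    float value;
    std::memcpy(&value, address, sizeof value);
    return value;
}
```

A wrong xStride or yStride selects the wrong table entry, while a wrong sampleStride only affects samples after the first one in each pixel, which is why the pattern of correct values hints at which stride is at fault.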

Try writing a small amount of 32 bit float data (e.g. a 2x2 pixel image
with 1 channel) with compression set to NO_COMPRESSION and check the
file contents in a hex editor: the last 4 bytes of the file should be
the last sample of the last pixel of the last channel in the file. That
might tell you whether you are writing the file correctly.
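
If no hex editor is at hand, the same check can be scripted. The sketch below uses only the standard library; lastFloatInFile is a hypothetical helper, and it assumes the trailing four bytes are a little-endian IEEE-754 float, which is how uncompressed OpenEXR float data is stored.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <string>

// Read the final 4 bytes of a file and interpret them as a little-endian float.
// Returns false if the file cannot be opened or is shorter than 4 bytes.
bool lastFloatInFile(const std::string& path, float& value)
{
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in)
        return false;

    const std::streamoff size = in.tellg();
    if (size < 4)
        return false;

    unsigned char b[4];
    in.seekg(size - 4);
    if (!in.read(reinterpret_cast<char*>(b), 4))
        return false;

    // Assemble the bytes explicitly so the result does not depend on host byte order.
    const std::uint32_t bits = static_cast<std::uint32_t>(b[0])
                             | (static_cast<std::uint32_t>(b[1]) << 8)
                             | (static_cast<std::uint32_t>(b[2]) << 16)
                             | (static_cast<std::uint32_t>(b[3]) << 24);
    std::memcpy(&value, &bits, sizeof value);
    return true;
}
```

Comparing the returned value with the last sample handed to the writer gives a quick indication of whether the write side is behaving.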

Post by Michael Wolf
Hello Richard,
This is a snippet from my simple deep reader (half RGBA, float Z),
minus all the cleanup.
It was written some time ago as a simple test, but it works for the
purpose. Maybe it helps you find the issue.
--- snip ---
// Not in the original post: the headers and globals the snippet appears to rely on.
#include <ImfArray.h>
#include <ImfDeepFrameBuffer.h>
#include <ImfDeepScanLineInputFile.h>
#include <ImfHeader.h>
#include <ImathBox.h>
#include <half.h>
#include <iostream>

Imath::Box2i dataWindow, displayWindow;
int width, height;

Imf::Array2D< unsigned int > sampleCount;
Imf::Array2D< half* > dataR, dataG, dataB, dataA;

void readDeepExr(const char *filename)
{
  Imf::DeepScanLineInputFile file(filename);
  const Imf::Header &header = file.header();
  dataWindow = header.dataWindow();
  displayWindow = header.displayWindow();
  // Window bounds are inclusive, hence the + 1.
  width = dataWindow.max.x - dataWindow.min.x + 1;
  height = dataWindow.max.y - dataWindow.min.y + 1;

  sampleCount.resizeEraseUnsafe(height, width);
  Imf::Array2D< float* > dataZ(height, width);
  dataR.resizeEraseUnsafe(height, width);
  dataG.resizeEraseUnsafe(height, width);
  dataB.resizeEraseUnsafe(height, width);
  dataA.resizeEraseUnsafe(height, width);

  Imf::DeepFrameBuffer frameBuffer;
  // Each base pointer is shifted so that pixel (min.x, min.y) maps to element [0][0].
  frameBuffer.insertSampleCountSlice (Imf::Slice (Imf::UINT,
    (char *) (&sampleCount[0][0] - dataWindow.min.x - dataWindow.min.y * width),
    sizeof (unsigned int) * 1,       // xStride
    sizeof (unsigned int) * width)); // yStride
  frameBuffer.insert ("Z",
    Imf::DeepSlice (Imf::FLOAT,
    (char *) (&dataZ[0][0] - dataWindow.min.x - dataWindow.min.y * width),
    sizeof (float *) * 1,     // xStride for pointer array
    sizeof (float *) * width, // yStride for pointer array
    sizeof (float) * 1));     // stride for Z data sample
  frameBuffer.insert ("R",
    Imf::DeepSlice (Imf::HALF,
    (char *) (&dataR[0][0] - dataWindow.min.x - dataWindow.min.y * width),
    sizeof (half *),          // xStride for pointer array
    sizeof (half *) * width,  // yStride for pointer array
    sizeof (half)));          // stride for R data sample
  frameBuffer.insert ("G",
    Imf::DeepSlice (Imf::HALF,
    (char *) (&dataG[0][0] - dataWindow.min.x - dataWindow.min.y * width),
    sizeof (half *), sizeof (half *) * width, sizeof (half)));
  frameBuffer.insert ("B",
    Imf::DeepSlice (Imf::HALF,
    (char *) (&dataB[0][0] - dataWindow.min.x - dataWindow.min.y * width),
    sizeof (half *), sizeof (half *) * width, sizeof (half)));
  frameBuffer.insert ("A",
    Imf::DeepSlice (Imf::HALF,
    (char *) (&dataA[0][0] - dataWindow.min.x - dataWindow.min.y * width),
    sizeof (half *), sizeof (half *) * width, sizeof (half)));
  file.setFrameBuffer(frameBuffer);

  // Read the per-pixel sample counts first, then allocate one array per pixel.
  file.readPixelSampleCounts(dataWindow.min.y, dataWindow.max.y);
  for (int y = 0; y < height; y++)
  {
    for (int x = 0; x < width; x++)
    {
      int s = sampleCount[y][x];
      dataZ[y][x] = new float[s];
      dataR[y][x] = new half[s];
      dataG[y][x] = new half[s];
      dataB[y][x] = new half[s];
      dataA[y][x] = new half[s];
    }
  }
  file.readPixels(dataWindow.min.y, dataWindow.max.y);
  std::cout << "Done.\n";
  // clean up etc...
}
--- snip ---
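
The snippet above subtracts dataWindow.min.x + dataWindow.min.y * width from the address of element [0][0] for every slice. A minimal sketch of why that works, using plain index arithmetic and no OpenEXR types (Box, baseShift and arrayIndex are hypothetical names, and strides are counted in elements to keep it short):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an inclusive 2D data window.
struct Box { int minX, minY, maxX, maxY; };

// Number of elements by which the base pointer is moved back.
std::ptrdiff_t baseShift(const Box& dw)
{
    const std::ptrdiff_t width = dw.maxX - dw.minX + 1;
    return dw.minX + dw.minY * width;
}

// Array element reached for absolute pixel (x, y) when the reader computes
// shiftedBase + x * 1 + y * width.
std::ptrdiff_t arrayIndex(const Box& dw, int x, int y)
{
    const std::ptrdiff_t width = dw.maxX - dw.minX + 1;
    return (x + y * width) - baseShift(dw);
}
```

For a data window running from (100, 50) to (355, 57), pixel (100, 50) lands on element 0 and pixel (355, 57) on the last element, so a buffer of exactly width * height entries suffices even though the window does not start at the origin.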
Cheers,
Mike
--
db&w Bornemann und Wolf GbR
Seyfferstr. 34
70197 Stuttgart
Deutschland
http://www.db-w.com
tel: +49 (711) 664 525-3
fax: +49 (711) 664 525-1
mob: +49 (173) 66 37 652
skype: lupus_lux

Richard Hadsell

2018-04-20 18:50:39 UTC


Your last suggestion was most helpful. I had already examined the pointers and strides, set the part to NO_COMPRESSION, and generated only 1 sample per pixel, but I had not tried it with tiny images.

I found that an image with a single scanline and up to 256 pixels was okay. However, 257 pixels resulted in the first pixel (pix[0]) having junk and the other pixels being shifted, so that pix[1-256] had the values that should have been in pix[0-255]. This was the result from reading the file.

Using 'od' to look at the file showed that the last float in the file was, indeed, the sample for pix[256]. Maybe the write was okay, but the read was not.

(Testing with 258 pixels in the scanline resulted in bad data for pix[0-1], and values in pix[2-257] were the samples that should have been in pix[0-255].)

For the test with 257 pixels, I looked at the data in TotalView and saw that readPtr, as calculated in line 673 of ImfDeepScanLineInputFile.cpp, points to floats that correspond to the samples that are bad. The first float is junk, and the next 256 samples are those that should have been in pix[0-255]. The float after that was 0, not the value that should have been in pix[256], so I conclude that the data in the buffer were not read correctly from the file.

Of course, I don't know enough about the layout of data in the file to know whether the problem is in reading or writing. I can see that the samples are there in the file, but maybe they are in the wrong place.

I hope this is enough information for someone to duplicate the problem. I don't think I can take it any farther myself.


Richard Hadsell

2018-04-20 19:06:34 UTC


Here is another clue that might help someone find the bug:

The shift in data is independent of the number of channels in the deep-data part. When I tested with 6 channels (named A, B, G, R, X, and Y), the channels were written and read in that (alphabetic) order. Looking at the bad data read for pix[0], I discovered that pix[0].A was garbage, but pix[0].B was pix[256].A. Each channel of pix[0] was the previous channel's value for the last sample. It looks like the data were in the buffer, but the start pointer was off by one float.


Peter Hillman

2018-04-22 21:13:39 UTC


Duplicating the problem will be hard without seeing your code. If the
IlmImfTest deep tests are passing, but your code is failing, that would
make me suspect there's either a bug in your own code, or that you are
using the API in an unusual way - different to how the tests use it -
that's triggering a bug we've not seen before. The IlmImfTest suite does
write images similar to the ones you've been testing with.

A couple of perhaps more obvious things to double-check: make sure that
you don't have a mysterious 8 bit variable there that's wrapping round
in a weird way, or some calculation that's casting to a 'char' instead
of a 32 or 64 bit value. That could fit your symptoms, particularly if
it's in the part of the code where you set up the array of pointers to
float arrays to store each pixel. Also, a common slip-up with OpenEXR is
forgetting that the data and display windows are /inclusive/ - a 256
pixel wide image has displayWindow.max.x set to 255.
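
Both slip-ups are easy to demonstrate in isolation. The sketch below is illustrative only: Window is a hypothetical stand-in for an inclusive window, not an OpenEXR type, and the second function simply shows what an 8-bit variable does to a pixel count.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one axis of an inclusive data or display window.
struct Window { int minX; int maxX; };

// Both bounds are inclusive, so the pixel count needs the "+ 1".
int windowWidth(const Window& w)
{
    return w.maxX - w.minX + 1;
}

// The value that survives when a width is stored in an 8-bit variable:
// only the low 8 bits are kept, so the value wraps modulo 256.
int widthThroughUint8(int width)
{
    const std::uint8_t narrow = static_cast<std::uint8_t>(width);
    return narrow;
}
```

A width of 257 squeezed through 8 bits comes out as 1 and a width of 256 as 0, so a failure that first appears at 257 pixels is the kind of symptom such a truncation could produce.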

On Linux, you might try running your code through valgrind to see if it
identifies any issues with accessing uninitialised or out-of-bounds memory.

In your descriptions you don't mention how many samples per pixel are
being written. Perhaps try writing a 257-pixel-wide scanline with one data
sample in each pixel, then an image where the first or the last pixel
has many samples and all the rest have 0 samples. This might shed light
on whether the odd behaviour you are seeing is dependent on the total
number of samples written, or the total number of pixels. You can also
try writing with an offset dataWindow (e.g. a 256 pixel wide dataWindow
with dataWindow.min.x = 100 and dataWindow.max.x = 355) to see whether the
256 pixel problem is relative to the dataWindow or the displayWindow.


Richard Hadsell

2018-04-23 15:46:32 UTC


My tests were using 1 sample per pixel. I verified in TotalView that the pointers I set up are correct, and the OpenEXR code is accessing them correctly.

I will try to test other variations of sample numbers, as you suggest. And I will also try to run it through valgrind.

Meanwhile, I will also try to work around the problem by using tiles that are no more than 256 pixels wide.


Richard Hadsell

2018-04-27 18:43:49 UTC


I found the problem in our code. It was a mistake in the return code from the read function in our I/O class that derives from IStream. The EOF indication was reversed. In ordinary reads, this was ignored. Only when the read function was called from a skip function did the bug have an effect: it stopped the skipping over the sample counts after the first 1024 bytes (256 pixels).
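
The failure mode can be reproduced in miniature. The sketch below is not OpenEXR code: MemoryStream, skipBytes and positionAfterSkip are hypothetical names, and the 1024-byte chunk size is taken from the figure quoted above. It assumes a read() that returns false once the last byte of the stream has been consumed and true otherwise, with an invertEofFlag switch to mimic the reversed indication.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical in-memory stream used only for this illustration.
class MemoryStream
{
  public:
    MemoryStream(std::vector<char> bytes, bool invertEofFlag)
        : _bytes(std::move(bytes)), _pos(0), _invert(invertEofFlag) {}

    // Copies n bytes; returns true while more data remains (unless inverted).
    bool read(char* dst, std::size_t n)
    {
        assert(_pos + n <= _bytes.size());
        std::memcpy(dst, _bytes.data() + _pos, n);
        _pos += n;
        const bool moreData = _pos < _bytes.size();
        return _invert ? !moreData : moreData;
    }

    std::size_t tell() const { return _pos; }

  private:
    std::vector<char> _bytes;
    std::size_t _pos;
    bool _invert;
};

// Skip n bytes in 1024-byte chunks, giving up as soon as read() reports EOF.
void skipBytes(MemoryStream& in, std::size_t n)
{
    char chunk[1024];
    while (n >= sizeof chunk)
    {
        if (!in.read(chunk, sizeof chunk))
            return;
        n -= sizeof chunk;
    }
    if (n > 0)
        in.read(chunk, n);
}

// Stream position after skipping one 4-byte count per pixel, with an equal
// amount of sample data following the counts.
std::size_t positionAfterSkip(std::size_t pixels, bool invertEofFlag)
{
    const std::size_t countBytes = pixels * 4;
    std::vector<char> contents(countBytes * 2);
    MemoryStream in(std::move(contents), invertEofFlag);
    skipBytes(in, countBytes);
    return in.tell();
}
```

With 257 pixels (1028 bytes of counts), the inverted flag leaves the position at 1024 instead of 1028, one float short, which is consistent with the one-sample shift reported earlier in the thread; 258 pixels leave it two floats short, and 256 pixels or fewer are unaffected.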

Thank you for your suggestions. Testing the effects of various numbers of samples in various places showed nothing new, but it forced me to follow the actual read operations in TotalView, which uncovered the skip function that was returning early.
