
I am embedding a PDF into an image, but after extraction I get a PDF with blank pages. How can I extract exactly the PDF file that I inserted?

% Embedding PDF file into an image using LSB substitution
% Set the file names for the PDF file and the cover image
pdfFileName = 'UMNwriteup.pdf';
imageFileName = 'glioma.jpg';
% Read the PDF file as binary
pdfData = fileread(pdfFileName);
pdfData = uint8(pdfData);
% Read the cover image
coverImage = imread(imageFileName);
% Get the dimensions of the cover image
[rows, columns, ~] = size(coverImage);
% Calculate the maximum number of bytes that can be embedded
maxBytes = (rows * columns * 3) / 8;
% Check if the PDF file size exceeds the maximum embedding capacity
if numel(pdfData) > maxBytes
    error('PDF file size exceeds the maximum embedding capacity of the cover image.');
end
% Convert the PDF data into binary format
pdfBinary = de2bi(pdfData, 8, 'left-msb');
pdfBinary = pdfBinary(:);
% Get the number of bits to be embedded
numBits = numel(pdfBinary);
% Reshape the cover image to match the number of bits
coverImage = reshape(coverImage, [], 1);
% Embed the PDF data into the cover image using LSB substitution
coverImage(1:numBits) = bitset(coverImage(1:numBits), 1, pdfBinary);
% Reshape the modified cover image back to the original dimensions
coverImage = reshape(coverImage, rows, columns, 3);
% Save the stego image with the embedded PDF data
stegoImageFileName = 'stego_image.png';
imwrite(coverImage, stegoImageFileName);
% Extraction of PDF file from the stego image
% Read the stego image
stegoImage = imread(stegoImageFileName);
% Reshape the stego image into a single column
stegoImage = reshape(stegoImage, [], 1);
% Extract the embedded PDF data from the stego image
extractedPDFBinary = bitget(stegoImage(1:numBits), 1);
% Reshape the extracted binary data into bytes
extractedPDFData = reshape(extractedPDFBinary, [], 8);
extractedPDFData = bi2de(extractedPDFData, 'left-msb');
% Convert the extracted PDF data from uint8 to char
extractedPDFData = char(extractedPDFData);
% Write the extracted PDF data to a file
outputFileName = 'extracted.pdf';
fid = fopen(outputFileName, 'w');
fwrite(fid, extractedPDFData, 'uint8');
fclose(fid);
disp('Extraction complete.');
disp(['The extracted PDF file has been saved as: ' outputFileName]);

Accepted Answer

Image Analyst on 3 Jul 2023
See my attached stego/hiding/watermarking demos. Maybe there is something there that you can use or adapt. Good luck.
1 comment
Ramya on 6 Jul 2023
% Demo by Image Analyst to hide an audio signal in a uint8 gray scale image by encoding it in the least significant bit.
%============================================================================================================================================
% Initialization Steps.
clc; % Clear the command window.
close all; % Close all figures (except those of imtool.)
clear; % Erase all existing variables. Or clearvars if you want.
workspace; % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 15;
markerSize = 4;
%============================================================================================================================================
% Read in image
% grayImage = imread('moon.tif'); % This image is too small to contain the sound file.
baseFileName = 'glioma.jpg';
grayImage = imread(baseFileName); % A uint8 image.
numPixels = numel(grayImage);
subplot(2, 2, 1);
imshow(grayImage, []);
impixelinfo;
caption = sprintf('Original Image : "%s"', baseFileName);
title(caption, 'FontSize', fontSize)
%============================================================================================================================================
% Read in demo audio file that ships with MATLAB.
[y, fs] = audioread("button-1.wav");
y = y(:); % If stereo, stack left channel on top of right channel.
% Get the time axis
t = linspace(0, length(y) / fs, length(y))';
subplot(2, 2, [3,4]);
plot(t, y, 'b-');
grid on;
yline(0, 'Color', 'k'); % Draw line along x axis.
title('Original Audio Signal vs. Time that is Encoded into Above Right Image', 'FontSize', fontSize)
xlabel('Time (seconds)', 'FontSize', fontSize)
ylabel('Audio Signal', 'FontSize', fontSize)
ylim([-1, 1]);
xlim([0, max(t)]);
drawnow; % Force immediate screen painting so we can see the inputs while the encoding and decoding process go on.
% Play the sound.
% playerObject = audioplayer(y, fs);
% play(playerObject)
% Convert y to 16 bits to have enough resolution so that the sound signal value won't be changed much due to round off error.
y16 = uint16(rescale(y, 0, 65535));
% Sound was converted to uint16, so we will need 16 pixels to store one sound value.
% Make an output image initialized to the same as the original image.
stegoImage = grayImage;
% See if the image is big enough to hide all bits of the audio signal.
numPixelsRequired = length(y) * 16;
if numPixels < numPixelsRequired
    errorMessage = sprintf('Cannot fit sound into image.\nThe image is %d elements long.\nThe sound is %d elements long.\nThe sound file needs the image to have at least %d pixels (= 16 * %d) to contain the entire sound.\n', ...
        numPixels, length(y), numPixelsRequired, length(y));
    uiwait(errordlg(errorMessage));
else
    fprintf('The image is %d elements long.\nThe sound is %d elements long.\nThe sound file needs the image to have at least %d pixels (= 16 * %d) to contain the entire sound.\n', ...
        numPixels, length(y), numPixelsRequired, length(y));
end
%============================================================================================================================================
% Now encode the audio signal in the least significant bit of the uint8 gray scale image.
for k = 1 : numel(y16)
    binaryNumberString = dec2bin(y16(k), 16);
    if mod(k, 10000) == 0
        fprintf('Changing pixel #%d of %d.\n', k, numel(y16));
    end
    % fprintf('%d in uint16 is %s in binary', y16(k), binaryNumberString);
    imageIndex = (k - 1) * 16 + 1;
    these16GrayLevels = stegoImage(imageIndex : imageIndex + 15);
    for k2 = 1 : length(binaryNumberString)
        gl = these16GrayLevels(k2);
        if binaryNumberString(k2) == '1'
            % Image gray level needs to be odd.
            % If the gray level is not already odd, make it odd.
            if rem(gl, 2) == 0 % if it's even...
                % It's even. Add 1 to it to make it odd.
                gl = gl + 1;
                stegoImage(imageIndex + k2 - 1) = gl;
            end
        else % binaryNumberString(k2) == '0'
            % Image gray level needs to be even.
            % If the gray level is not already even, make it even.
            if rem(gl, 2) == 1 % If it's odd...
                % It's odd. Add 1 to it to make it even, unless it's already 255 because we can't have a value of 256 for a uint8 variable.
                if gl <= 254
                    gl = gl + 1;
                else
                    % Value is 255 initially. Make it even by making it 254, since we can't do 256.
                    gl = 254;
                end
                stegoImage(imageIndex + k2 - 1) = gl;
            end
        end
    end
    % these16GrayLevelsNow = stegoImage(imageIndex : imageIndex + 15)
end
subplot(2, 2, 2);
imshow(stegoImage, []);
impixelinfo;
title('With embedded sound file', 'FontSize', fontSize)
drawnow;
% Maximize the figure.
g = gcf;
g.WindowState = 'maximized';
g.Name = 'Demo by Image Analyst';
g.NumberTitle = 'off';
%============================================================================================================================================
% Now undo the encoding process and recover the hidden sound
% by looking at the last (least significant) bit of the image gray levels and assigning that to a new sound.
yExtracted = zeros(length(y16), 1);
counter = 1;
for k = 1 : 16 : numel(y16) * 16
    % Get a vector of 16 pixel values.
    these16GrayLevels = stegoImage(k : k+15);
    % Get the least significant bits of those 16 pixel values.
    b = bitget(these16GrayLevels, 1);
    % Convert to a string, then to a decimal number.
    soundValue = bin2dec(sprintf('%c', b+48));
    if mod(counter, 10000) == 0
        fprintf('Assigning sound sample #%d of %d.\n', counter, numel(y16));
    end
    % Assign that sound value to the output sound signal.
    yExtracted(counter) = soundValue;
    counter = counter + 1;
end
% Convert y from uint16 back to floating point like the original y.
yRecovered = yExtracted / 65535;
%============================================================================================================================================
% Play the recovered sound.
playerObject = audioplayer(yRecovered, fs);
play(playerObject)
% Double check that they're the same.
% If they're the same, the value below should be 1, true.
theyAreTheSame = isequal(y16, yExtracted)
I am getting the following output:
The image is 786432 elements long.
The sound is 17640 elements long.
The sound file needs the image to have at least 282240 pixels (= 16 * 17640) to contain the entire sound.
Changing pixel #10000 of 17640.
Assigning sound sample #10000 of 17640.
Warning: No audio outputs were found.
> In audiovideo.internal/audioplayerOnline/hasNoAudioHardware (line 491)
In audiovideo.internal/audioplayerOnline/initialize (line 327)
In audiovideo.internal.audioplayerOnline (line 175)
In audioplayer (line 134)
In LSB_hide_audio_in_image (line 135)
Warning: No audio outputs were found.
> In audiovideo.internal/audioplayerOnline/hasNoAudioHardware (line 491)
In audiovideo.internal/audioplayerOnline/play (line 200)
In audioplayer/play (line 349)
In LSB_hide_audio_in_image (line 136)
theyAreTheSame =
logical
1
Why?
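The output above reports `theyAreTheSame = 1`, so the embedded samples were recovered intact; the warnings only say that no audio output device was found for playback. One possible workaround, sketched here under assumptions, is to save the recovered signal to a WAV file and listen to it elsewhere. The sine wave below is a synthetic stand-in for the demo's `yRecovered`, and the temporary file name is hypothetical.

```matlab
% Sketch: save a recovered signal to disk instead of playing it.
% ASSUMPTION: the sine wave stands in for the demo's yRecovered (range 0..1).
fs = 8000;                                   % sample rate in Hz (illustrative)
t = (0 : fs - 1).' / fs;                     % one second of samples
yRecovered = 0.5 + 0.4 * sin(2 * pi * 440 * t);

% Write a WAV file to a temporary location (hypothetical file name).
outFile = [tempname '.wav'];
audiowrite(outFile, yRecovered, fs);

% Read it back to confirm that the file holds the same signal.
[yCheck, fsCheck] = audioread(outFile);
fprintf('Wrote %d samples at %d Hz to %s\n', numel(yCheck), fsCheck, outFile);
```

The file can then be downloaded or opened in any audio player, which sidesteps the missing audio hardware.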


More Answers (1)

DGM on 2 Jul 2023
Edited: DGM on 2 Jul 2023
fileread() is really just a convenience wrapper for fread() meant for reading text files. Up until R2020-something, it had no way to specify the encoding; it just blindly read the file using a default presumed encoding. In this case, it's likely reading the data using a two-byte encoding, so casting the char vector as uint8 destroys the data.
If you want to read a binary file strictly bytewise, just use fread(...,'*uint8') instead of trying to work around the automatic encoding detection used by fileread() or by fread(...,'*char').
% Read the PDF file as binary
fid = fopen(pdfFileName,'r');
pdfData = fread(fid,'*uint8');
fclose(fid);
See the bottom of the table here for the comments on how char inputs are handled:
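As a rough illustration of how the bytewise read fits into the whole embed/extract cycle, here is a minimal round-trip sketch. It is not the original poster's script: the random payload and cover image are synthetic stand-ins for the PDF and `glioma.jpg`, the temporary file names are hypothetical, and the bit conversion uses `bitget`/`bitshift` loops in place of `de2bi`/`bi2de` so that no Communications Toolbox is needed.

```matlab
% Sketch: byte-exact LSB round trip using fread/fwrite.
% ASSUMPTION: synthetic payload and cover image replace the real files.
srcFile = [tempname '.bin'];
fid = fopen(srcFile, 'w');
fwrite(fid, uint8(randi([0 255], 1000, 1)), 'uint8');
fclose(fid);

% Read the payload strictly bytewise, as suggested above.
fid = fopen(srcFile, 'r');
payload = fread(fid, '*uint8');
fclose(fid);

cover = uint8(randi([0 255], 64, 64, 3));    % stand-in for imread(...)

% Expand each byte into 8 bits, MSB first, one row per byte.
bits = zeros(numel(payload), 8, 'uint8');
for b = 1:8
    bits(:, b) = bitget(payload, 9 - b);
end
bits = bits(:);                              % column-major, as in the question
numBits = numel(bits);
assert(numBits <= numel(cover), 'Payload does not fit in the cover image.');

% Embed into the least significant bits and save losslessly as PNG.
stego = cover(:);
stego(1:numBits) = bitset(stego(1:numBits), 1, bits);
stego = reshape(stego, size(cover));
pngFile = [tempname '.png'];
imwrite(stego, pngFile);

% Extract: read the PNG back and rebuild the bytes.
flat = reshape(imread(pngFile), [], 1);
outBits = reshape(bitget(flat(1:numBits), 1), [], 8);
recovered = zeros(size(outBits, 1), 1, 'uint8');
for b = 1:8
    recovered = bitor(recovered, bitshift(outBits(:, b), 8 - b));
end

% Write the recovered bytes without any char conversion.
outFile = [tempname '.bin'];
fid = fopen(outFile, 'w');
fwrite(fid, recovered, 'uint8');
fclose(fid);

fprintf('Round trip identical: %d\n', isequal(payload, recovered));
```

Keeping the data as `uint8` from `fread` through `fwrite` means no character encoding is ever applied to the file contents.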
3 comments
Ramya on 3 Jul 2023
% Embedding larger PDF file into an image using chunk-wise LSB substitution
% Set the file names for the PDF file and the cover image
pdfFileName = 'UMNwriteup.pdf';
imageFileName = 'glioma.jpg';
% Embedding larger PDF file into an image using chunk-wise LSB substitution
% Read the PDF file as binary
pdfData = fileread(pdfFileName);
pdfData = uint8(pdfData);
% Read the cover image
coverImage = imread(imageFileName);
% Get the dimensions of the cover image
[rows, columns, ~] = size(coverImage);
% Calculate the maximum number of bytes that can be embedded per chunk
maxBytesPerChunk = (rows * columns * 3) / 8;
% Divide the PDF data into chunks
numChunks = ceil(numel(pdfData) / maxBytesPerChunk);
chunks = cell(numChunks, 1);
for i = 1:numChunks
    startIndex = (i-1) * maxBytesPerChunk + 1;
    endIndex = min(i * maxBytesPerChunk, numel(pdfData));
    chunks{i} = pdfData(startIndex:endIndex);
end
% Embed each chunk into the cover image using LSB substitution
for i = 1:numChunks
    chunk = chunks{i};
    % Convert the chunk data into binary format
    chunkBinary = de2bi(chunk, 8, 'left-msb');
    chunkBinary = chunkBinary(:);
    % Get the number of bits to be embedded
    numBits = numel(chunkBinary);
    % Reshape the cover image to match the number of bits
    coverImage = reshape(coverImage, [], 1);
    % Embed the chunk data into the cover image using LSB substitution
    coverImage(1:numBits) = bitset(coverImage(1:numBits), 1, chunkBinary);
    % Reshape the modified cover image back to the original dimensions
    coverImage = reshape(coverImage, rows, columns, 3);
    % Save the stego image with the embedded chunk data
    stegoImageFileName = sprintf('stego_image_chunk%d.png', i);
    imwrite(coverImage, stegoImageFileName);
end
disp('Embedding complete.');
% Extraction of PDF file from the stego image
% Initialize the extracted PDF data
extractedPDFData = [];
% Extract each chunk from the stego images and append to the extracted PDF data
for i = 1:numChunks
    % Read the stego image
    stegoImageFileName = sprintf('stego_image_chunk%d.png', i);
    stegoImage = imread(stegoImageFileName);
    % Reshape the stego image into a single column
    stegoImage = reshape(stegoImage, [], 1);
    % Extract the embedded chunk data from the stego image
    extractedChunkBinary = bitget(stegoImage, 1);
    extractedChunkBinary = reshape(extractedChunkBinary, [], 8);
    % Convert the extracted binary data to uint8
    extractedChunkData = uint8(bi2de(extractedChunkBinary, 'left-msb'));
    % Append the extracted chunk data to the complete PDF data
    extractedPDFData = [extractedPDFData; extractedChunkData];
end
% Convert the extracted PDF data to a character array
charPDFData = char(extractedPDFData.');
% Write the character array to a temporary file
tempFileName = 'temp_pdf_file.bin';
fid = fopen(tempFileName, 'w');
fwrite(fid, charPDFData, 'char');
fclose(fid);
% Convert the temporary file to PDF using MATLAB's built-in function
outputFileName = 'extracted.pdf';
systemCommand = ['java -jar pdfbox-app-2.0.25.jar ExtractText -console "' tempFileName '" > "' outputFileName '"'];
[status, result] = system(systemCommand);
if status == 0
    disp('Extraction complete.');
    disp(['The extracted PDF file has been saved as: ' outputFileName]);
else
    disp('Extraction failed.');
end
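The extraction loop above reads the least significant bit of every pixel, so nothing tells it where the payload ends. One common way around that, offered here only as a sketch and not as something used in this thread, is to embed the payload length in a small header ahead of the data. The 32-bit header, the `toBits` and `fromBits` helpers, and the random stand-in data are all assumptions made for illustration.

```matlab
% Sketch: LSB embedding with a 32-bit length header.
% ASSUMPTION: random data stands in for the PDF bytes and the cover image.
payload = uint8(randi([0 255], 500, 1));
cover = uint8(randi([0 255], 128, 128));

% Hypothetical helpers: bytes -> bit column (MSB first per byte) and back.
toBits = @(bytes) reshape(bitget(repmat(bytes(:), 1, 8), ...
    repmat(8:-1:1, numel(bytes), 1)).', [], 1);
fromBits = @(bits) uint8(sum(double(reshape(bits, 8, [])) .* (2 .^ (7:-1:0)).', 1)).';

% Header: payload length as 4 bytes in the machine's native byte order.
lenBytes = typecast(uint32(numel(payload)), 'uint8');
message = [lenBytes(:); payload];

% Embed header and payload into the least significant bits.
bits = toBits(message);
assert(numel(bits) <= numel(cover), 'Message does not fit in the cover image.');
flat = cover(:);
flat(1:numel(bits)) = bitset(flat(1:numel(bits)), 1, bits);
stego = reshape(flat, size(cover));

% Extract: read the 32 header bits first, then exactly 8*n payload bits.
flatOut = stego(:);
header = fromBits(bitget(flatOut(1:32), 1));
n = double(typecast(header, 'uint32'));
recovered = fromBits(bitget(flatOut(33 : 32 + 8 * n), 1));

fprintf('Header says %d bytes; payload identical: %d\n', n, isequal(recovered, payload));
```

With the length stored in the image itself, the extractor no longer depends on a `numBits` variable left over in the workspace from the embedding step.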
DGM on 3 Jul 2023
I don't know. I don't know where that jar file comes from, but I can't run that on my installation.
Using fileread() like that with the PDFs I've tested on my system in R2019b will reliably result in the read data being corrupted (values are different, wrong number of bytes returned). A char is not necessarily 1 byte.
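A quick way to check this on any given installation is to write a file with known bytes and compare what the two readers return. The sketch below is only a diagnostic under assumptions: the 256-byte test file and its temporary name are made up for illustration, and what `fileread()` returns will depend on the MATLAB release and the system's default encoding, so no particular outcome is claimed for it.

```matlab
% Sketch: compare fread('*uint8') with fileread() on a known binary file.
% ASSUMPTION: a temporary 256-byte test file stands in for the PDF.
testFile = [tempname '.bin'];
original = uint8(0:255).';                   % every possible byte value once
fid = fopen(testFile, 'w');
fwrite(fid, original, 'uint8');
fclose(fid);

% Bytewise read: one uint8 per byte on disk.
fid = fopen(testFile, 'r');
viaFread = fread(fid, '*uint8');
fclose(fid);

% Text-oriented read, then the same cast used in the question.
viaFileread = uint8(fileread(testFile)).';

info = dir(testFile);
fprintf('Bytes on disk: %d\n', info.bytes);
fprintf('fread returned %d bytes, identical to original: %d\n', ...
    numel(viaFread), isequal(viaFread, original));
fprintf('fileread returned %d chars, identical to original: %d\n', ...
    numel(viaFileread), isequal(viaFileread, original));
delete(testFile);
```

If the last line reports a different count or a mismatch, the text-oriented read is altering the data on that system, which matches the corruption described above.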
