sandbox/sander/output_htg.h
#include "output_pvd.h"
#include "utils.h"
This field enables the export of AMR-meshes into the HyperTreeGrid
Dataformat. It works on Quad-/Octrees with and without MPI. The HTG file
format is still under development so compatibility with future Paraview
versions might be limited. Advantage of the HTG file format is the ~7x
smaller size due to implicit point location. The data is stored under
the path
variable, a time series can be opend using the
corresponding collection file (.pvd) with name “output_
output_htg()
is a wrapper function that additionally
creates a collection file pointing to different time steps of an htg
simulation. If MPI is used the HTG is written with collective
MPI_File_IO. If no MPI is used standard POSIX fopen/fclose operations
are used.
Be sure to used the updated, MPI_IO compatible
output_pvd.h
!
Example Usage
event report(i+=10){
char path[]="htg"; // no slash at the end!!
char prefix[80];
sprintf(prefix, "data_%06d", i);
output_htg((scalar ) {cs, d,p},(vector ){u}, path, prefix, i, t);
}
The Collection File Name includes the used path. For the example above the Collection File is called “output_htg”. This allows to run a parameter study based on the same binary if you include the current case configuration in the path name. (e.g. Level, Reynolds number, velocity, acceleration, …)
Flags
#define HEADER_MIN_MAX_VAL 1
If this flag is set the corresponding min/max values for each values are included in the header. This is probably not needed, the adavantage is unknown. It probably increases the posprocessing speed. Defaults to 1 (true).
#define WRITE_HTG_LEVEL 1
If this flag is set the level of each cell is exported as well and selectable during postprocessing. Defaults to 0 (false).
#define HTG_SPEED_STATS 1
If this flag is set information about the write speed are posted (to stdout). Defaults to 0 (false).
#define VTK_FILE_VERSION 20
If this flag is set to 20, the created HyperTreeGrid has the VTKFile version 2.0. It is supported starting with Paraview Version 5.10. Defaults to 10 (VTKFile version 1.0)
Known HTG Problems
- HyperTreeGridToDualGrid stops working when advancing time step (Paraview related)
- Contour Filter does not work on HTG with only one tree (like this exporter creates) (Paraview related)
- x-z Axis are swapped (3D), x-y Axis are swapped (2D)
Details of the parallel implementation
The mapping of the Basilisk internal tree structure to the HyperTreeGrid structure is not trivial. The HTG structure is defined level wise. Multiple processors can write a single tree, if each processes writes its cells level wise after those from the preceeding process. The offsets and cells per level have to recalculated every output step. These offsets and cells per level are used to create a derived datatype (MPI_Type_indexed). Each process copys the values of all its nodes into a continous array. The MPI_Type_indexed datatype allows to write this continous array as a correclty spaced array to the file, which allows other processes to write their data in the spaces left. The HyperTreeGrid further needs the size of the following data in bytes in addition to the actual data.
_| size_in_bytes_of_the_following_blob |_actual_data_scalar1 | size...blob |_ac...data |...
It is therefore convenient to define a struct consisting of the an integer with the blob size and the array of the data to write. This can be done using an MPI_Type_struct datatype. While the struct contains the interger and a pointer to a continous space in memory the data on the file will be the integer followed by the correctly spaced MPI_Type_indexed.
This enables using MPI_File_set_view and MPI_File_write_all (collective operation) for good IO-performance.
The header and the tail is written only by process 0.
The actual tree structure of the HTG ist defined using a bitmask. If a node has children its value is 1, else it is 0. This is done level wise as well. On the next level the value of each (still existing) node is described using 1 (is refined/has childen) or 0 (not refined/ does not have childen) as well. This is repeated for each level except the last level because these values are all 0 due to being the last level and are neglected.
The problem arises when storing this bitmask, as it is only possible
the write a byte (8 bits). If a process on a level does not have a
number of nodes that is devisible by 8 (a whole byte), logical gaps in
the bitmask appear which are not allowed. This problem arises
immediately at level 0, as process 0 has one 1 bit (level zero has
children), but writing a 1000 0000
byte is wrong as it
already describes the values of the 7 following nodes as well, whose
values are unknown to process 0. Therefore the describtor bits (stored
as bytes (u_int8_t) in memory) have to be moved to the next process (or
level) until a number divisible by 8 is resident on the process. A
‘wave’ of excess descriptor bytes is swept through all processes and
levels til the last process of the last level who fills the missing
bytes with 0.
To allow this moving of upto 7 bytes a special array structure is chosen:
|8_empty_byt|descr_byt_lvl0|7_empty_byt|descr_byt_lvl1|…|7_empty_byt| 0 8
Each process has to space to accommodate upto 7 bytes from the preceeding process/level infront of the own descriptor bytes. Each process keeps track of how many bytes to send/ to recv/ and stay on a level and on which index the level begins taking the recvieved bytes into account. After this every process has a number of descriptor bytes that can be transformed into descriptor bits. This is actually done in the same array starting a index 0 which is guaranteed to not contain any data. In byte 0 the information of the next 8 relevant bytes is written as bits.
With information about offset and length again a MPI_Type_indexed and a MPI_Type struct can be defined. This again is used to write the bitmask in a collective operation.
Detail of the serial implementation
Each process caches the data on each level in a contingues array which is written to disk sequentially with only one fwrite() funtion call. The Bitmask is written in “appened” mode as well resulting in an uncluttered file
//~ #define HEADER_MIN_MAX_VAL 0
//~ #define WRITE_HTG_LEVEL 1
//~ #define HTG_SPEED_STATS 1
//~ #define VTK_FILE_VERSION 20 // 2.0
#ifndef HEADER_MIN_MAX_VAL
#define HEADER_MIN_MAX_VAL 1
#endif
#ifndef WRITE_HTG_LEVEL
#define WRITE_HTG_LEVEL 0
#endif
#ifndef HTG_SPEED_STATS
#define HTG_SPEED_STATS 0
#endif
#ifndef VTK_FILE_VERSION
#define VTK_FILE_VERSION 10 // 1.0
#endif
void output_htg(scalar * list, vector * vlist, const char* path, char* prefix, int i, double t);
#if _MPI
void output_htg_data_mpiio(scalar * list, vector * vlist, MPI_File fp);
#else
void output_htg_data(scalar * list, vector * vlist, FILE * fp);
#endif
#if _MPI
void output_htg(scalar * list, vector * vlist, const char* path, char* prefix, int i, double t) {
;
MPI_File fp
int ec;
char htg_name[80];
sprintf(htg_name, "%s/%s.htg", path, prefix);
= MPI_File_open(MPI_COMM_WORLD, htg_name, \
ec | MPI_MODE_CREATE, MPI_INFO_NULL, &fp);
MPI_MODE_WRONLY // Overwrite File
if (ec == MPI_ERR_FILE_EXISTS) {
printf("ERR, htg_name exists!\n");
// MPI_File_delete(htg_name, MPI_INFO_NULL);
(MPI_COMM_WORLD, htg_name, \
MPI_File_open| MPI_MODE_CREATE, MPI_INFO_NULL, &fp);
MPI_MODE_WRONLY (fp, 0);
MPI_File_set_size}
if(ec != MPI_SUCCESS){
printf("output_htg.h : %s could not be opened\n Does the Folder exist?\n", htg_name);
(MPI_COMM_WORLD, 2);
MPI_Abort}
output_htg_data_mpiio((scalar *) list,(vector *)vlist,fp);
(&fp);
MPI_File_close
if(pid()==0){
bool firstTimeWritten = false;
char pvd_name[80];
sprintf(pvd_name,"output_%s.pvd",path);
= MPI_File_open(MPI_COMM_SELF, pvd_name, \
ec , MPI_INFO_NULL, &fp);
MPI_MODE_RDWR
if( (i == 0) || (ec == MPI_ERR_NO_SUCH_FILE) ) {
// Overwrite File
// Best implementation??
(MPI_COMM_SELF, pvd_name, \
MPI_File_open|MPI_MODE_CREATE, MPI_INFO_NULL, &fp);
MPI_MODE_WRONLY(fp, 0);
MPI_File_set_size= true;
firstTimeWritten }
output_pvd_mpiio(htg_name, t, fp, firstTimeWritten);
(&fp);
MPI_File_close}
(MPI_COMM_WORLD);
MPI_Barrier}
#else // no MPI
void output_htg(scalar * list, vector * vlist, const char* path, char* prefix, int i, double t) {
FILE * fp ;
char htg_name[80];
sprintf(htg_name, "%s/%s.htg", path, prefix);
= fopen(htg_name, "w");
fp if(!fp){
printf("output_htg.h : %s could not be opened\n Does the Folder exist?\n", htg_name);
exit(1);
}
output_htg_data((scalar *) list,(vector *) vlist, fp);
fclose(fp);
bool firstTimeWritten = false;
char pvd_name[80];
sprintf(pvd_name,"output_%s.pvd",path);
= fopen(pvd_name, "r+");
fp if( (i == 0) || (fp == NULL) ) {
= fopen(pvd_name,"w");
fp = true;
firstTimeWritten }
output_pvd(htg_name, t, fp, firstTimeWritten);
fclose(fp);
}
#endif
#if _MPI
#define Write2File(x) do { \
(x); \
(fp,&buffer, strlen(buffer), MPI_CHAR, MPI_STATUS_IGNORE); \
MPI_File_write} while(0)
void output_htg_data_mpiio(scalar * list, vector * vlist, MPI_File fp)
{
#if defined(_OPENMP)
int num_omp = omp_get_max_threads();
(1);
omp_set_num_threads#endif
unsigned int vertices_local = 0;
unsigned int descBits_local = 0;
unsigned int vertices_local_pL[grid->maxdepth+1]; // pL = per Level
unsigned int descBits;
unsigned int vertices;
unsigned int vertices_pL[grid->maxdepth+1];
for (int lvl = 0; lvl < grid->maxdepth+1; ++lvl){
[lvl] = 0;
vertices_local_pL(lvl,serial)
foreach_levelif(is_local(cell))
[lvl]++;
vertices_local_pL
+= vertices_local_pL[lvl];
vertices_local }
= vertices_local - vertices_local_pL[grid->maxdepth]; //
descBits_local
(&vertices_local_pL[0], &vertices_pL[0], grid->maxdepth + 1,\
MPI_Reduce, MPI_SUM, 0, MPI_COMM_WORLD);
MPI_UNSIGNED
#if HTG_SPEED_STATS
(MPI_COMM_WORLD);
MPI_Barrierdouble start = MPI_Wtime();
#endif
= 0;
MPI_Offset offset
int vertices_global_offset[grid->maxdepth+1];
[0] = 0;
vertices_global_offsetunsigned int carryover = 0;
for (int lvl = 0; lvl <= grid->maxdepth; ++lvl){
(&vertices_local_pL[lvl], \
MPI_Exscan &vertices_global_offset[lvl], \
1, \
, \
MPI_UNSIGNED, \
MPI_SUM);
MPI_COMM_WORLD// Pass Offset to process 0 for Exscan on next level
if (pid() == (npe() - 1)) {
unsigned int next_offset;
= vertices_global_offset[lvl] + vertices_local_pL[lvl];
next_offset (&next_offset, 1, MPI_UNSIGNED, 0, 0, MPI_COMM_WORLD);
MPI_Ssend}
if (pid() == 0) {
[lvl] -= carryover; // temporarily remove offset added in previous level
vertices_local_pL
(&carryover, 1, MPI_UNSIGNED, npe() - 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
MPI_Recv
if (lvl < grid->maxdepth) {
[lvl+1] += carryover; // temporarily use array to carry over offset to next level
vertices_local_pL[lvl+1] = carryover;
vertices_global_offset}
else= carryover;
vertices if (lvl == grid->maxdepth - 1)
= carryover;
descBits }
}
(&vertices, 1, MPI_UNSIGNED, 0, MPI_COMM_WORLD);
MPI_Bcast(&descBits, 1, MPI_UNSIGNED, 0, MPI_COMM_WORLD); MPI_Bcast
- Calculate min,max for xml header (prob. not needed?)
#if HTG_SPEED_STATS
(MPI_COMM_WORLD);
MPI_Barrierdouble t_stats_start = MPI_Wtime();
#endif
double min_val[list_len(list)];
double max_val[list_len(list)];
{
int i = 0;
for (scalar s in list) {
#if HEADER_MIN_MAX_VAL
stats stat = statsf(s);
[i]=stat.min;
min_val[i]=stat.max;
max_val#else
[i]=0.;
min_val[i]=0.;
max_val#endif
++;
i}
}
double min_val_v[vectors_len(vlist)];
double max_val_v[vectors_len(vlist)];
{
int i = 0;
for (vector v in vlist) {
#if HEADER_MIN_MAX_VAL
[i]= HUGE;
min_val_v[i]= -HUGE;
max_val_vforeach_dimension(){
stats stat = statsf(v.x);
[i] = min(stat.min,min_val_v[i]);
min_val_v[i]= max(stat.max,max_val_v[i]);
max_val_v}
#else
[i]= 0.;
min_val_v[i]= 0.;
max_val_v#endif
++;
i}
}
#if HTG_SPEED_STATS
(MPI_COMM_WORLD);
MPI_Barrierdouble t_stats = MPI_Wtime() - t_stats_start ;
double t_header_start = MPI_Wtime();
#endif
char buffer[256];
File Header vtkHyperTreeGrid
if (pid() == 0){
//~ WriteHeader(&offset, infoStruct);
int maj_v = VTK_FILE_VERSION/10, min_v = VTK_FILE_VERSION%10;
(sprintf(buffer,\
Write2File"<VTKFile %s version=\"%i.%i\" %s %s>\n",\
"type=\"HyperTreeGrid\"", \
, min_v , \
maj_v"byte_order=\"LittleEndian\" ",\
"header_type=\"UInt32\""));
#if dimension==2
(sprintf(buffer,\
Write2File"\t<HyperTreeGrid BranchFactor=\"2\" TransposedRootIndexing=\"0\" Dimensions=\"%d %d %d\">\n",\
2,2,1));
#elif dimension==3
(sprintf(buffer,\
Write2File"\t<HyperTreeGrid BranchFactor=\"2\" TransposedRootIndexing=\"0\" Dimensions=\"%d %d %d\">\n", 2,2,2));
#endif
(sprintf(buffer,"\t\t<Grid>\n"));
Write2File#if dimension==2
(sprintf(buffer, "\t\t\t<DataArray type=\"Float64\" Name=\"XCoordinates\" NumberOfTuples=\"2\" format=\"ascii\" RangeMin=\"%g\" RangeMax=\"%g\">\n",Y0, Y0+L0));
Write2File(sprintf(buffer, "\t\t\t\t%g %g",Y0, Y0+L0));
Write2File(sprintf(buffer,"\n\t\t\t</DataArray>\n"));
Write2File(sprintf(buffer, "\t\t\t<DataArray type=\"Float64\" Name=\"YCoordinates\" NumberOfTuples=\"2\" format=\"ascii\" RangeMin=\"%g\" RangeMax=\"%g\">\n",X0, X0+L0));
Write2File(sprintf(buffer, "\t\t\t\t%g %g",X0, X0+L0));
Write2File(sprintf(buffer,"\n\t\t\t</DataArray>\n"));
Write2File(sprintf(buffer, "\t\t\t<DataArray type=\"Float64\" Name=\"ZCoordinates\" NumberOfTuples=\"2\" format=\"ascii\" RangeMin=\"%g\" RangeMax=\"%g\">\n",Z0, Z0+L0));
Write2File(sprintf(buffer, "\t\t\t\t%g %g", 0., 0.));
Write2File#elif dimension==3
(sprintf(buffer, "\t\t\t<DataArray type=\"Float64\" Name=\"XCoordinates\" NumberOfTuples=\"2\" format=\"ascii\" RangeMin=\"%g\" RangeMax=\"%g\">\n",\
Write2File, Z0+L0));
Z0(sprintf(buffer, "\t\t\t\t%g %g",Z0, Z0+L0));
Write2File(sprintf(buffer,"\n\t\t\t</DataArray>\n"));
Write2File(sprintf(buffer, "\t\t\t<DataArray type=\"Float64\" Name=\"YCoordinates\" NumberOfTuples=\"2\" format=\"ascii\" RangeMin=\"%g\" RangeMax=\"%g\">\n",\
Write2File, Y0+L0));
Y0(sprintf(buffer, "\t\t\t\t%g %g",Y0, Y0+L0));
Write2File(sprintf(buffer,"\n\t\t\t</DataArray>\n"));
Write2File(sprintf(buffer, "\t\t\t<DataArray type=\"Float64\" Name=\"ZCoordinates\" NumberOfTuples=\"2\" format=\"ascii\" RangeMin=\"%g\" RangeMax=\"%g\">\n",\
Write2File, X0+L0));
X0(sprintf(buffer, "\t\t\t\t%g %g",X0, X0+L0));
Write2File#endif
(sprintf(buffer,"\n\t\t\t</DataArray>\n"));
Write2File(sprintf(buffer,"\t\t</Grid>\n"));
Write2File(sprintf(buffer,"\t\t<Trees>\n"));
Write2File
// Tree Begin
unsigned int byte_offset = 0;
#if VTK_FILE_VERSION == 10
(sprintf(buffer,"\t\t\t<Tree Index=\"0\" NumberOfLevels=\"%d\" NumberOfVertices=\"%u\">\n",\
Write2File->maxdepth + 1, vertices));
grid
// Array Descriptor Bits
(sprintf(buffer,"\t\t\t\t<DataArray type=\"Bit\" Name=\"Descriptor\" NumberOfTuples=\"%u\" format=\"appended\" RangeMin=\"0\" RangeMax=\"1\" offset=\"%u\"/>\n",\
Write2File, byte_offset));
descBits+= (descBits/8+1) * sizeof(u_int8_t) + sizeof(u_int32_t);
byte_offset
// Array NbVerticesByLevel
(sprintf(buffer,"\t\t\t\t<DataArray type=\"Int64\" Name=\"NbVerticesByLevel\" NumberOfTuples=\"%d\" format=\"ascii\" RangeMin=\"1\" RangeMax=\"%u\" >\n\t\t\t\t\t",\
Write2File->maxdepth + 1, vertices_pL[grid->maxdepth]));
grid
for (int lvl = 0; lvl <= grid->maxdepth; lvl++) {
(sprintf(buffer,"%u ", vertices_pL[lvl]));
Write2File}
(sprintf(buffer,"\n\t\t\t\t</DataArray>\n"));
Write2File#endif
#if VTK_FILE_VERSION == 20
// Array Descriptor Bits
(sprintf(buffer,"\t\t\t\t<DataArray type=\"Bit\" Name=\"Descriptors\" NumberOfTuples=\"%u\" format=\"appended\" RangeMin=\"0\" RangeMax=\"1\" offset=\"%u\"/>\n",\
Write2File, byte_offset));
descBits+= (descBits/8+1) * sizeof(u_int8_t) + sizeof(u_int32_t);
byte_offset
(sprintf(buffer,"\t\t\t\t<DataArray type=\"Int64\" Name=\"NumberOfVerticesPerDepth\" NumberOfTuples=\"%d\" format=\"ascii\" RangeMin=\"1\" RangeMax=\"%u\" >\n\t\t\t\t\t",\
Write2File->maxdepth + 1, vertices_pL[grid->maxdepth]));
grid
for (int lvl = 0; lvl <= grid->maxdepth; lvl++) {
(sprintf(buffer,"%u ", vertices_pL[lvl]));
Write2File}
(sprintf(buffer,"\n\t\t\t\t</DataArray>\n"));
Write2File
(sprintf(buffer,"\t\t\t\t<DataArray type=\"Int64\" Name=\"TreeIds\" NumberOfTuples=\"1\" format=\"ascii\" RangeMin=\"0\" RangeMax=\"0\" >\n"));
Write2File(sprintf(buffer,"\t\t\t\t\t%i\n",0));
Write2File(sprintf(buffer,"\t\t\t\t</DataArray>\n"));
Write2File
(sprintf(buffer,"\t\t\t\t<DataArray type=\"UInt32\" Name=\"DepthPerTree\" NumberOfTuples=\"1\" format=\"ascii\" RangeMin=\"%i\" RangeMax=\"%i\" >\n",\
Write2File->maxdepth+1, grid->maxdepth+1));
grid(sprintf(buffer,"\t\t\t\t\t%i\n",grid->maxdepth+1));
Write2File(sprintf(buffer,"\t\t\t\t</DataArray>\n"));
Write2File
(sprintf(buffer,"\t\t</Trees>\n"));
Write2File#endif
// Cell Data Begin
(sprintf(buffer,"\t\t\t\t<CellData>\n"));
Write2File
#if WRITE_HTG_LEVEL
(sprintf(buffer,"\t\t\t\t\t<DataArray type=\"UInt8\" Name=\"Level\" NumberOfTuples=\"%u\" format=\"appended\" RangeMin=\"0\" RangeMax=\"%i\" offset=\"%u\"/>\n",\
Write2File, grid->maxdepth, byte_offset));
vertices+= vertices * sizeof(u_int8_t) + sizeof(u_int32_t);
byte_offset #endif
{
int i = 0;
for (scalar s in list)
{
(sprintf(buffer,"\t\t\t\t\t<DataArray type=\"Float32\" Name=\"%s\" NumberOfTuples=\"%u\" format=\"appended\" RangeMin=\"%g\" RangeMax=\"%g\" offset=\"%u\"/>\n",\
Write2File.name, vertices, min_val[i], max_val[i], byte_offset));
s+= vertices * sizeof(float_t) + sizeof(u_int32_t);
byte_offset ++;
i}
}
{
int i = 0;
for (vector v in vlist)
{
char *vname = strtok(v.x.name, ".");
(sprintf(buffer, "\t\t\t\t\t<DataArray type=\"Float32\" NumberOfComponents=\"%i\" Name=\"%s\" NumberOfTuples=\"%u\" format=\"appended\" RangeMin=\"%g\" RangeMax=\"%g\" offset=\"%u\"/>\n",\
Write2File3,vname, vertices, min_val_v[i],max_val_v[i],byte_offset));
+= vertices * 3 * sizeof(float_t) + sizeof(u_int32_t);
byte_offset ++;
i}
}
// Cell Data End
(sprintf(buffer,"\t\t\t\t</CellData>\n"));
Write2File#if VTK_FILE_VERSION == 10
(sprintf(buffer,"\t\t\t</Tree>\n\t\t</Trees>\n"));
Write2File#endif
(sprintf(buffer,"\t</HyperTreeGrid>\n\t<AppendedData encoding=\"raw\">\n_"));
Write2File;
MPI_Offset offset_tmp(fp, &offset_tmp);
MPI_File_get_position+= offset_tmp;
offset } // end pid() == 0
(&offset, 1, MPI_OFFSET, 0, MPI_COMM_WORLD);
MPI_Bcast#if HTG_SPEED_STATS
double t_header = MPI_Wtime() - t_header_start;
#endif
Write Bitmask
int cell_size;
{
=sizeof(u_int8_t);
cell_sizeint vertices_local_pL_offset[grid->maxdepth+1];
Create an array large enough to hold data + spacing between the levels
long length_w_spacing = descBits_local + 7*(grid->maxdepth+1) + 8;
* mask = (u_int8_t*)calloc( length_w_spacing, cell_size); //calloc needed?
u_int8_t
long index = 8; // start a 8
for(int lvl=0; lvl < grid->maxdepth;++lvl) { // iterate through array and place lokal mask.
[lvl] = index;
vertices_local_pL_offset(lvl,serial) {
foreach_levelif (is_local(cell)) {
mask[index++] = (u_int8_t)(!is_leaf(cell));
}
}
+= 7; // make space for byte movement
index }
[grid->maxdepth] = index;
vertices_local_pL_offset
assert(length_w_spacing > index);
int vertices_local_pL_corr[grid->maxdepth+1];
for (int lvl = 0; lvl < grid->maxdepth; ++lvl) {
[lvl] = vertices_local_pL[lvl];
vertices_local_pL_corr}
int vertices_local_pL_offset_corr[grid->maxdepth]; // new offset
long vertices_local_corr = 0;
int num_from_prev = 0;
[7];
u_int8_t tmpfor (int lvl = 0; lvl < grid->maxdepth; ++lvl) {
for (int pe=0; pe < npe(); ++pe) {
int send_rank = pe;
int recv_rank = (pe+1) % npe(); // next higher rank + loop
if (pid() == send_rank) {
// work on prev send data
[lvl] += num_from_prev;
vertices_local_pL_corrint trg_position = vertices_local_pL_offset[lvl]-num_from_prev;
[lvl] = trg_position;
vertices_local_pL_offset_corrfor (int tmp_cnt = 0; tmp_cnt < num_from_prev; ++tmp_cnt)
{
// printf("pid: %i, lvl: %i trg_pos: %i, tmp_cnt: %i, lws: %li\n", pid(), lvl, trg_position, tmp_cnt,length_w_spacing); fflush(stdout);
mask[trg_position + tmp_cnt] = tmp[tmp_cnt];
}
int num_to_next = (vertices_local_pL_corr[lvl]%8);
[lvl] -= num_to_next;
vertices_local_pL_corr
// if last level and last process, append space
if ((lvl == grid->maxdepth -1 ) && (pid() == npe() -1) ) {
[lvl] += 8;
vertices_local_pL_corr}
+= vertices_local_pL_corr[lvl];
vertices_local_corr
// MPI_Ssend(&num_to_next, 1, MPI_INT, recv_rank, 0, MPI_COMM_WORLD);
(&num_to_next, 1, MPI_INT, recv_rank, 0, MPI_COMM_WORLD);
MPI_Send
int src_position = vertices_local_pL_offset[lvl+1] - 7 - num_to_next; // sum of vertices incl thsi level
// MPI_Ssend(&mask[src_position], num_to_next, MPI_UINT8_T, recv_rank, 0, MPI_COMM_WORLD);
(&mask[src_position], num_to_next, MPI_UINT8_T, recv_rank, 1, MPI_COMM_WORLD);
MPI_Send}
if (pid() == recv_rank) {
(&num_from_prev, 1, MPI_INT, send_rank, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
MPI_Recv(&tmp[0], num_from_prev, MPI_UINT8_T, send_rank, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
MPI_Recv}
}
}
if (vertices_local_corr %8 != 0) // sanity check
(MPI_COMM_WORLD, 2);
MPI_Abort
//* TEMP EXPORT FILE STRUCTURE */
#if 0
FILE * fpt;
char exp_name[80];
sprintf(exp_name, "structure_pid%04i.dat", pid());
= fopen(exp_name, "w");
fpt fprintf(fpt, "%li\n", length_w_spacing);
for (int lvl = 0; lvl < grid->maxdepth; ++lvl) {
fprintf(fpt, "%i\n", vertices_local_pL_offset_corr[lvl]);
fprintf(fpt, "%i\n", vertices_local_pL_corr[lvl]) ;
}
fclose(fpt);
#endif
//* END TEMP EXPORT FILE STRUCTURE */
// zero copy bitmask creation
int i = 0, cnt = 0;
for (int lvl = 0; lvl < grid->maxdepth; ++lvl) {
int displacement = vertices_local_pL_offset_corr[lvl]; // offset of correct array within my array
int count = vertices_local_pL_corr[lvl]; // corrected count
for (int c = 0; c < count; ++c) {
mask[i] |= mask[displacement + c] << (7-cnt);
if (++cnt%8 == 0) {
mask[++i] = 0; // incr i and clear next byte
=0;
cnt}
}
}
We need to calculate the global offset of the bit packets for the * MPI_Type_indexed datatype describing the tree data
int vertices_global_offset_corr[grid->maxdepth]; // not "+1"
[0] = 0;
vertices_global_offset_corrunsigned int carryover = 0;
for (int lvl = 0; lvl < grid->maxdepth; ++lvl){
(&vertices_local_pL_corr[lvl], \
MPI_Exscan &vertices_global_offset_corr[lvl], \
1, \
, \
MPI_INT, \
MPI_SUM);
MPI_COMM_WORLD// Send Data to Process 0 for the Exscan on next Level
if (pid() == (npe() - 1)) {
unsigned int next_offset;
= vertices_global_offset_corr[lvl] + vertices_local_pL_corr[lvl];
next_offset (&next_offset, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
MPI_Ssend}
if (pid() == 0) {
[lvl] -= carryover; // remove temporary offset added in previous level
vertices_local_pL_corr
(&carryover, 1, MPI_INT, npe() - 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
MPI_Recvif (lvl + 1 < grid->maxdepth ) {
[lvl+1] += carryover; // use array temporarily to carry over offset to next level
vertices_local_pL_corr[lvl+1] = carryover;
vertices_global_offset_corr}
}
}
Create a struct which will be written to RAM
struct descBit_t{
;
u_int32_t size* data;
u_int8_t}descBit_struct;
.size = (descBits/8 + 1) * cell_size;
descBit_struct.data = &mask[0]; descBit_struct
Define where the data is located in Memory
[2];
MPI_Aint m_displacements;
MPI_Aint base_address(&descBit_struct, &base_address);
MPI_Get_address(&descBit_struct.size, &m_displacements[0]);
MPI_Get_address(&descBit_struct.data[0], &m_displacements[1]);
MPI_Get_address[0] = MPI_Aint_diff(m_displacements[0], base_address);
m_displacements[1] = MPI_Aint_diff(m_displacements[1], base_address);
m_displacements
[2] = { MPI_UINT32_T, MPI_BYTE };
MPI_Datatype m_types//~
Tree has to have some data, else there is a segmentationfault when setting view
int m_lengths[2] = {1, vertices_local_corr/8}; // Bits
; // memory view
MPI_Datatype m_view(2, m_lengths, m_displacements, m_types, &m_view);
MPI_Type_create_struct(&m_view);
MPI_Type_commit
int lengths[grid->maxdepth];
int displacements[grid->maxdepth];
Define the File view for writing the bitmask [1/2]
for(int lvl = 0; lvl < grid->maxdepth; ++lvl) {
[lvl] = (int)vertices_local_pL_corr[lvl]/8; // bits, not bytes
lengths[lvl] = (int)vertices_global_offset_corr[lvl]/8; // bits, not bytes
displacements}
;
MPI_Datatype tree_type_descBit(grid->maxdepth, lengths, displacements, MPI_BYTE, &tree_type_descBit);
MPI_Type_indexed(&tree_type_descBit); MPI_Type_commit
Define the File view for writing the bitmask [2/2]
[2] = {0, sizeof(u_int32_t)};
MPI_Aint f_displacementsint f_lengths[2] = {1, 1};
[2] = { MPI_UINT32_T, tree_type_descBit};
MPI_Datatype f_types
; // file_view
MPI_Datatype f_view(2, f_lengths, f_displacements, f_types, &f_view);
MPI_Type_create_struct(&f_view); MPI_Type_commit
Set the view according to f_fiew (file_view)
(fp, offset, f_view, f_view, "native", MPI_INFO_NULL); MPI_File_set_view
Write from memory according to m_view (memory_view)
(fp, &descBit_struct, 1, m_view, MPI_STATUS_IGNORE);
MPI_File_write_all
+= (descBits/8+1) * sizeof(u_int8_t) + sizeof(u_int32_t);
offset
(&m_view);
MPI_Type_free(&f_view);
MPI_Type_free(&tree_type_descBit);
MPI_Type_free
free(mask); mask = NULL;
.data = NULL;
descBit_struct}
Write LEVEL
#if WRITE_HTG_LEVEL
{
struct level_t{
;
u_int32_t size* data;
u_int8_t}level_struct;
=sizeof(u_int8_t);
cell_size.size = vertices * cell_size;
level_struct.data = (u_int8_t*) malloc(vertices_local*cell_size);
level_struct
[2];
MPI_Aint m_displacements;
MPI_Aint base_address(&level_struct, &base_address);
MPI_Get_address(&level_struct.size, &m_displacements[0]);
MPI_Get_address(&level_struct.data[0], &m_displacements[1]);
MPI_Get_address[0] = MPI_Aint_diff(m_displacements[0], base_address);
m_displacements[1] = MPI_Aint_diff(m_displacements[1], base_address);
m_displacements
[2] = { MPI_UINT32_T, MPI_UINT8_T};
MPI_Datatype m_typesint m_lengths[2] = {1, vertices_local};
;
MPI_Datatype m_view(2, m_lengths, m_displacements, m_types, &m_view);
MPI_Type_create_struct(&m_view);
MPI_Type_commit
int lengths[grid->maxdepth+1];
int displacements[grid->maxdepth+1];
for(int lvl = 0; lvl < grid->maxdepth + 1; ++lvl){
[lvl] = (int) vertices_local_pL[lvl];
lengths[lvl] = (int) vertices_global_offset[lvl];
displacements}
;
MPI_Datatype tree_type_level(grid->maxdepth + 1, lengths, displacements, MPI_UINT8_T, &tree_type_level);
MPI_Type_indexed(&tree_type_level);
MPI_Type_commit
[2] = {0, sizeof(u_int32_t)};
MPI_Aint f_displacementsint f_lengths[2] = {1, 1};
[2] = { MPI_UINT32_T, tree_type_level};
MPI_Datatype f_types
;
MPI_Datatype f_view(2, f_lengths, f_displacements, f_types, &f_view);
MPI_Type_create_struct(&f_view);
MPI_Type_commit
long index = 0;
for(int lvl=0; lvl<=grid->maxdepth;++lvl)
(lvl,serial)
foreach_levelif (is_local(cell))
.data[index++] = (u_int8_t)lvl;
level_struct
(fp, offset, f_view,f_view, "native", MPI_INFO_NULL);
MPI_File_set_view(fp, &level_struct, 1, m_view, MPI_STATUS_IGNORE);
MPI_File_write_all
+= vertices*cell_size + sizeof(u_int32_t);
offset
free(level_struct.data); level_struct.data = NULL;
(&m_view);
MPI_Type_free(&f_view);
MPI_Type_free(&tree_type_level);
MPI_Type_free}
#endif // end WRITE_HTG_LEVEL
{
struct scalar_t{
;
u_int32_t sizefloat* data;
}scalar_struct;
=sizeof(float);
cell_size.size = vertices*cell_size;
scalar_struct.data = (float*) malloc(vertices_local*cell_size);
scalar_struct
[2];
MPI_Aint m_displacements;
MPI_Aint base_address(&scalar_struct, &base_address);
MPI_Get_address(&scalar_struct.size, &m_displacements[0]);
MPI_Get_address(&scalar_struct.data[0], &m_displacements[1]);
MPI_Get_address[0] = MPI_Aint_diff(m_displacements[0], base_address);
m_displacements[1] = MPI_Aint_diff(m_displacements[1], base_address);
m_displacements
[2] = { MPI_UINT32_T, MPI_FLOAT};
MPI_Datatype m_typesint m_lengths[2] = {1, vertices_local};
;
MPI_Datatype m_view(2, m_lengths, m_displacements, m_types, &m_view);
MPI_Type_create_struct(&m_view);
MPI_Type_commit
int lengths[grid->maxdepth+1];
int displacements[grid->maxdepth+1];
for(int lvl = 0; lvl < grid->maxdepth + 1; ++lvl){
[lvl] = (int) vertices_local_pL[lvl];
lengths[lvl] = (int) vertices_global_offset[lvl];
displacements}
;
MPI_Datatype tree_type_scalar(grid->maxdepth + 1, lengths, displacements, MPI_FLOAT, &tree_type_scalar);
MPI_Type_indexed(&tree_type_scalar);
MPI_Type_commit
[2] = {0, sizeof(u_int32_t)};
MPI_Aint f_displacementsint f_lengths[2] = {1, 1};
[2] = { MPI_UINT32_T, tree_type_scalar};
MPI_Datatype f_types
;
MPI_Datatype f_view(2, f_lengths, f_displacements, f_types, &f_view);
MPI_Type_create_struct(&f_view);
MPI_Type_commit
for (scalar s in list) {
long index = 0;
for(int lvl=0; lvl<=grid->maxdepth;++lvl)
(lvl,serial)
foreach_levelif (is_local(cell))
.data[index++] = (float)val(s);
scalar_struct
(fp, offset, f_view,f_view, "native", MPI_INFO_NULL);
MPI_File_set_view(fp, &scalar_struct, 1, m_view, MPI_STATUS_IGNORE);
MPI_File_write_all+= vertices*cell_size + sizeof(u_int32_t);
offset }
free(scalar_struct.data); scalar_struct.data = NULL;
(&m_view);
MPI_Type_free(&f_view);
MPI_Type_free(&tree_type_scalar);
MPI_Type_free}
{
struct vector_t{
;
u_int32_t sizefloat* data;
}vector_struct;
=3*sizeof(float);
cell_size.size = vertices*cell_size;
vector_struct.data = (float*) malloc(vertices_local*cell_size);
vector_struct
[2];
MPI_Aint m_displacements;
MPI_Aint base_address(&vector_struct, &base_address);
MPI_Get_address(&vector_struct.size, &m_displacements[0]);
MPI_Get_address(&vector_struct.data[0], &m_displacements[1]);
MPI_Get_address[0] = MPI_Aint_diff(m_displacements[0], base_address);
m_displacements[1] = MPI_Aint_diff(m_displacements[1], base_address);
m_displacements
[2] = { MPI_UINT32_T, MPI_FLOAT};
MPI_Datatype m_typesint m_lengths[2] = {1, 3*vertices_local};
;
MPI_Datatype m_view(2, m_lengths, m_displacements, m_types, &m_view);
MPI_Type_create_struct(&m_view);
MPI_Type_commit
int lengths[grid->maxdepth+1];
int displacements[grid->maxdepth+1];
for(int lvl = 0; lvl < grid->maxdepth + 1; ++lvl) {
[lvl] = 3*(int)vertices_local_pL[lvl];
lengths[lvl] = 3*(int)vertices_global_offset[lvl];
displacements}
;
MPI_Datatype tree_type_vector(grid->maxdepth + 1, lengths, displacements, MPI_FLOAT, &tree_type_vector);
MPI_Type_indexed(&tree_type_vector);
MPI_Type_commit
[2] = {0, sizeof(u_int32_t)};
MPI_Aint f_displacementsint f_lengths[2] = {1, 1};
[2] = { MPI_UINT32_T, tree_type_vector};
MPI_Datatype f_types
;
MPI_Datatype f_view(2, f_lengths, f_displacements, f_types, &f_view);
MPI_Type_create_struct(&f_view);
MPI_Type_commit
for (vector v in vlist) {
long index = 0;
for(int lvl=0; lvl<=grid->maxdepth;++lvl)
(lvl,serial)
foreach_levelif (is_local(cell)) {
#if dimension == 2
.data[index] = (float)val(v.y);
vector_struct.data[index+1] = (float)val(v.x);
vector_struct.data[index+2] = (float)0.;
vector_struct+= 3;
index #elif dimension == 3
.data[index] = (float)val(v.z);
vector_struct.data[index+1] = (float)val(v.y);
vector_struct.data[index+2] = (float)val(v.x);
vector_struct+= 3;
index #endif
}
(fp, offset, f_view,f_view, "native", MPI_INFO_NULL);
MPI_File_set_view(fp, &vector_struct, 1, m_view, MPI_STATUS_IGNORE);
MPI_File_write_all+= vertices*cell_size + sizeof(u_int32_t);
offset }
free(vector_struct.data); vector_struct.data = NULL;
(&m_view);
MPI_Type_free(&f_view);
MPI_Type_free(&tree_type_vector);
MPI_Type_free}
(fp, offset, MPI_BYTE,MPI_BYTE, "native", MPI_INFO_NULL);
MPI_File_set_view
if (pid() == 0)
(sprintf(buffer, "ENDBINARY\n\t</AppendedData>\n</VTKFile>\n"));
Write2File
#if HTG_SPEED_STATS
(MPI_COMM_WORLD);
MPI_Barrierdouble end = MPI_Wtime();
fprintf (stdout,"Write Time Taken: %f ms, Throughput: %.2f MB/s, Stats %.2f%%, Header: %.2f%%\n",\
(end-start)*1000., (double)(offset+strlen(buffer))/(end-start)/(double)sq(1024),\
/(end-start)*100.,t_header/(end-start)*100.);
t_statsfflush(stdout);
#endif
(fp);
MPI_File_sync(MPI_COMM_WORLD);
MPI_Barrier#if defined(_OPENMP)
(num_omp);
omp_set_num_threads#endif
}
#else
void output_htg_data(scalar * list, vector * vlist, FILE * fp)
{
#if defined(_OPENMP)
int num_omp = omp_get_max_threads();
(1);
omp_set_num_threads#endif
unsigned int vertices_local = 0;
unsigned int descBits_local = 0;
unsigned int vertices_local_pL[grid->maxdepth+1]; // pL = per Level
for (int lvl = 0; lvl < grid->maxdepth+1; ++lvl){
[lvl] = 0;
vertices_local_pL(lvl,serial)
foreach_levelif(is_local(cell))
[lvl]++;
vertices_local_pL
+= vertices_local_pL[lvl];
vertices_local }
= vertices_local - vertices_local_pL[grid->maxdepth];
descBits_local
#if HTG_SPEED_STATS
struct timespec start;
if( clock_gettime( CLOCK_REALTIME, &start) == -1 ) {
( "clock gettime" );
perror}
#endif
- Calculate min,max for xml header (prob. not needed?)
#if HTG_SPEED_STATS
struct timespec t_stats_start;
if( clock_gettime( CLOCK_REALTIME, &t_stats_start) == -1 ) {
( "clock gettime" );
perror
}
#endif
double min_val[list_len(list)];
double max_val[list_len(list)];
{
int i = 0;
for (scalar s in list) {
#if HEADER_MIN_MAX_VAL
stats stat = statsf(s);
[i]=stat.min;
min_val[i]=stat.max;
max_val#else
[i]=0.;
min_val[i]=0.;
max_val#endif
++;
i}
}
double min_val_v[vectors_len(vlist)];
double max_val_v[vectors_len(vlist)];
{
int i = 0;
for (vector v in vlist) {
#if HEADER_MIN_MAX_VAL
[i]= HUGE;
min_val_v[i]= -HUGE;
max_val_vforeach_dimension(){
stats stat = statsf(v.x);
[i] = min(stat.min,min_val_v[i]);
min_val_v[i]= max(stat.max,max_val_v[i]);
max_val_v}
#else
[i]= 0.;
min_val_v[i]= 0.;
max_val_v#endif
++;
i}
}
#if HTG_SPEED_STATS
struct timespec t_stats_stop;
if( clock_gettime( CLOCK_REALTIME, &t_stats_stop) == -1 ) {
( "clock gettime" );
perror
}
// to do t_stats_stop
double t_stats = ( t_stats_stop.tv_sec - t_stats_start.tv_sec )
+ (double)( t_stats_stop.tv_nsec - t_stats_start.tv_nsec )
/ (double)1000000000L;
struct timespec t_header_start;
if( clock_gettime( CLOCK_REALTIME, &t_header_start) == -1 ) {
( "clock gettime" );
perror
}
#endif
File Header vtkHyperTreeGrid
//~ WriteHeader(&offset, infoStruct);
int maj_v = VTK_FILE_VERSION/10, min_v = VTK_FILE_VERSION%10;
fprintf(fp,\
"<VTKFile %s version=\"%i.%i\" %s %s>\n",\
"type=\"HyperTreeGrid\"", \
, min_v , \
maj_v"byte_order=\"LittleEndian\" ",\
"header_type=\"UInt32\"");
#if dimension==2
fprintf(fp,\
"\t<HyperTreeGrid BranchFactor=\"2\" TransposedRootIndexing=\"0\" Dimensions=\"%d %d %d\">\n",\
2,2,1);
#elif dimension==3
fprintf(fp,\
"\t<HyperTreeGrid BranchFactor=\"2\" TransposedRootIndexing=\"0\" Dimensions=\"%d %d %d\">\n", 2,2,2);
#endif
fprintf(fp,"\t\t<Grid>\n");
#if dimension==2
fprintf(fp, "\t\t\t<DataArray type=\"Float64\" Name=\"XCoordinates\" NumberOfTuples=\"2\" format=\"ascii\" RangeMin=\"%g\" RangeMax=\"%g\">\n",Y0, Y0+L0);
fprintf(fp, "\t\t\t\t%g %g",Y0, Y0+L0);
fprintf(fp,"\n\t\t\t</DataArray>\n");
fprintf(fp, "\t\t\t<DataArray type=\"Float64\" Name=\"YCoordinates\" NumberOfTuples=\"2\" format=\"ascii\" RangeMin=\"%g\" RangeMax=\"%g\">\n",X0, X0+L0);
fprintf(fp, "\t\t\t\t%g %g",X0, X0+L0);
fprintf(fp,"\n\t\t\t</DataArray>\n");
fprintf(fp, "\t\t\t<DataArray type=\"Float64\" Name=\"ZCoordinates\" NumberOfTuples=\"2\" format=\"ascii\" RangeMin=\"%g\" RangeMax=\"%g\">\n",Z0, Z0+L0);
fprintf(fp, "\t\t\t\t%g %g", 0., 0.);
#elif dimension==3
fprintf(fp, "\t\t\t<DataArray type=\"Float64\" Name=\"XCoordinates\" NumberOfTuples=\"2\" format=\"ascii\" RangeMin=\"%g\" RangeMax=\"%g\">\n",\
, Z0+L0);
Z0fprintf(fp, "\t\t\t\t%g %g",Z0, Z0+L0);
fprintf(fp,"\n\t\t\t</DataArray>\n");
fprintf(fp, "\t\t\t<DataArray type=\"Float64\" Name=\"YCoordinates\" NumberOfTuples=\"2\" format=\"ascii\" RangeMin=\"%g\" RangeMax=\"%g\">\n",\
, Y0+L0);
Y0fprintf(fp, "\t\t\t\t%g %g",Y0, Y0+L0);
fprintf(fp,"\n\t\t\t</DataArray>\n");
fprintf(fp, "\t\t\t<DataArray type=\"Float64\" Name=\"ZCoordinates\" NumberOfTuples=\"2\" format=\"ascii\" RangeMin=\"%g\" RangeMax=\"%g\">\n",\
, X0+L0);
X0fprintf(fp, "\t\t\t\t%g %g",X0, X0+L0);
#endif
fprintf(fp,"\n\t\t\t</DataArray>\n");
fprintf(fp,"\t\t</Grid>\n");
fprintf(fp,"\t\t<Trees>\n");
// Tree Begin
unsigned int byte_offset = 0;
#if VTK_FILE_VERSION == 10
fprintf(fp,"\t\t\t<Tree Index=\"0\" NumberOfLevels=\"%d\" NumberOfVertices=\"%u\">\n",\
->maxdepth + 1, vertices_local);
grid
// Array Descriptor Bits
fprintf(fp,"\t\t\t\t<DataArray type=\"Bit\" Name=\"Descriptor\" NumberOfTuples=\"%u\" format=\"appended\" RangeMin=\"0\" RangeMax=\"1\" offset=\"%u\"/>\n",\
, byte_offset);
descBits_local+= (descBits_local/8+1) * sizeof(u_int8_t) + sizeof(u_int32_t);
byte_offset
// Array NbVerticesByLevel
fprintf(fp,"\t\t\t\t<DataArray type=\"Int64\" Name=\"NbVerticesByLevel\" NumberOfTuples=\"%d\" format=\"ascii\" RangeMin=\"1\" RangeMax=\"%u\" >\n\t\t\t\t\t",\
->maxdepth + 1, vertices_local_pL[grid->maxdepth]);
grid
for (int lvl = 0; lvl <= grid->maxdepth; lvl++) {
fprintf(fp,"%u ", vertices_local_pL[lvl]);
}
fprintf(fp,"\n\t\t\t\t</DataArray>\n");
#endif
#if VTK_FILE_VERSION == 20
// Array Descriptor Bits
fprintf(fp,"\t\t\t\t<DataArray type=\"Bit\" Name=\"Descriptors\" NumberOfTuples=\"%u\" format=\"appended\" RangeMin=\"0\" RangeMax=\"1\" offset=\"%u\"/>\n",\
, byte_offset);
descBits_local+= (descBits_local/8+1) * sizeof(u_int8_t) + sizeof(u_int32_t);
byte_offset
fprintf(fp,"\t\t\t\t<DataArray type=\"Int64\" Name=\"NumberOfVerticesPerDepth\" NumberOfTuples=\"%d\" format=\"ascii\" RangeMin=\"1\" RangeMax=\"%u\" >\n\t\t\t\t\t",\
->maxdepth + 1, vertices_local_pL[grid->maxdepth]);
grid
for (int lvl = 0; lvl <= grid->maxdepth; lvl++) {
fprintf(fp,"%u ", vertices_local_pL[lvl]);
}
fprintf(fp,"\n\t\t\t\t</DataArray>\n");
fprintf(fp,"\t\t\t\t<DataArray type=\"Int64\" Name=\"TreeIds\" NumberOfTuples=\"1\" format=\"ascii\" RangeMin=\"0\" RangeMax=\"0\" >\n");
fprintf(fp,"\t\t\t\t\t%i\n",0);
fprintf(fp,"\t\t\t\t</DataArray>\n");
fprintf(fp,"\t\t\t\t<DataArray type=\"UInt32\" Name=\"DepthPerTree\" NumberOfTuples=\"1\" format=\"ascii\" RangeMin=\"%i\" RangeMax=\"%i\" >\n",\
->maxdepth+1, grid->maxdepth+1);
gridfprintf(fp,"\t\t\t\t\t%i\n",grid->maxdepth+1);
fprintf(fp,"\t\t\t\t</DataArray>\n");
fprintf(fp,"\t\t</Trees>\n");
#endif
// Cell Data Begin
fprintf(fp,"\t\t\t\t<CellData>\n");
#if WRITE_HTG_LEVEL
fprintf(fp,"\t\t\t\t\t<DataArray type=\"UInt8\" Name=\"Level\" NumberOfTuples=\"%u\" format=\"appended\" RangeMin=\"0\" RangeMax=\"%i\" offset=\"%u\"/>\n",\
, grid->maxdepth, byte_offset);
vertices_local+= vertices_local * sizeof(u_int8_t) + sizeof(u_int32_t);
byte_offset #endif
{
int i = 0;
for (scalar s in list)
{
fprintf(fp,"\t\t\t\t\t<DataArray type=\"Float32\" Name=\"%s\" NumberOfTuples=\"%u\" format=\"appended\" RangeMin=\"%g\" RangeMax=\"%g\" offset=\"%u\"/>\n",\
.name, vertices_local, min_val[i], max_val[i], byte_offset);
s+= vertices_local * sizeof(float_t) + sizeof(u_int32_t);
byte_offset ++;
i}
}
{
int i = 0;
for (vector v in vlist)
{
char *vname = strtok(v.x.name, ".");
fprintf(fp, "\t\t\t\t\t<DataArray type=\"Float32\" NumberOfComponents=\"%i\" Name=\"%s\" NumberOfTuples=\"%u\" format=\"appended\" RangeMin=\"%g\" RangeMax=\"%g\" offset=\"%u\"/>\n",\
3,vname, vertices_local, min_val_v[i],max_val_v[i],byte_offset);
+= vertices_local * 3 * sizeof(float_t) + sizeof(u_int32_t);
byte_offset ++;
i}
}
fprintf(fp,"\t\t\t\t</CellData>\n");
#if VTK_FILE_VERSION == 10
fprintf(fp,"\t\t\t</Tree>\n\t\t</Trees>\n");
#endif
fprintf(fp,"\t</HyperTreeGrid>\n\t<AppendedData encoding=\"raw\">\n_");
#if HTG_SPEED_STATS
struct timespec t_header_stop;
if( clock_gettime( CLOCK_REALTIME, &t_header_stop) == -1 ) {
( "clock gettime" );
perror}
double t_header = ( t_header_stop.tv_sec - t_header_start.tv_sec )
+ (double)( t_header_stop.tv_nsec - t_header_start.tv_nsec )
/ (double)1000000000L;
#endif
Write Bitmask
int cell_size;
=sizeof(u_int8_t);
cell_size
int vertices_local_corr = ((descBits_local/8)+1)*8;
= vertices_local_corr;
u_int32_t prepend_size (&prepend_size, sizeof(u_int32_t), 1, fp);
fwrite* write_cache = (u_int8_t*)calloc(vertices_local_corr,cell_size);
u_int8_tlong index = 1;
for(int lvl=0; lvl<grid->maxdepth;++lvl){ // Bitmask ist "0" auf grid->maxdepth und muss nicht geschrieben werden
(lvl,serial)
foreach_levelif (is_local(cell)) {
if(is_leaf(cell)){
[index++] = 0; // ascii: 0
write_cache} else {
[index++] = 1; // ascii: 1
write_cache}
}
}
// Byte to bits
for (int i = 0; i < vertices_local_corr/8 ; ++i){
for ( int j = 0; j < 8; ++j ) {
[i] |= write_cache[(8*i+j) + 1] << (7-j);
write_cacheif ((j+1) == 8) {
[i+1] = 0; // clear next byte
write_cache}
}
}
(&write_cache[0], cell_size, vertices_local_corr/8, fp);
fwritefree(write_cache);
= NULL; write_cache
Write LEVEL
#if WRITE_HTG_LEVEL
=sizeof(u_int8_t);
cell_size
= vertices_local * cell_size;
prepend_size (&prepend_size, sizeof(u_int32_t), 1, fp);
fwrite
for(int lvl=0; lvl<=grid->maxdepth;++lvl){
* write_cache = (u_int8_t*) malloc(vertices_local_pL[lvl]*sizeof(u_int8_t));
u_int8_t long index = 0;
(lvl,serial)
foreach_levelif (is_local(cell))
[index++] = (u_int8_t)lvl;
write_cache
(&write_cache[0], cell_size, vertices_local_pL[lvl], fp);
fwritefree(write_cache);
= NULL;
write_cache }
#endif // end WRITE_HTG_LEVEL
for (scalar s in list) {
=sizeof(float_t);
cell_size
= vertices_local * cell_size;
u_int32_t prepend_size (&prepend_size, sizeof(u_int32_t), 1, fp);
fwrite
for(int lvl=0; lvl < grid->maxdepth + 1; ++lvl){
// offset per level
* write_cache = (float_t*) malloc(vertices_local_pL[lvl]*cell_size);
float_t long index = 0;
(lvl,serial)
foreach_levelif (is_local(cell))
[index++] = val(s);
write_cache
(&write_cache[0], cell_size, vertices_local_pL[lvl], fp);
fwritefree(write_cache);
= NULL;
write_cache }
}
for (vector v in vlist) {
= 3*sizeof(float_t);
cell_size
= vertices_local * cell_size;
u_int32_t prepend_size (&prepend_size, sizeof(u_int32_t), 1, fp);
fwrite
for(int lvl=0; lvl<=grid->maxdepth;++lvl){
* write_cache = (float_t*) malloc(vertices_local_pL[lvl]*cell_size);
float_t long index = 0;
(lvl,serial)
foreach_levelif (is_local(cell)) {
#if dimension == 2
[index] = val(v.y);
write_cache[index+1] = val(v.x);
write_cache[index+2] = 0.;
write_cache+= 3;
index #elif dimension == 3
[index] = val(v.z);
write_cache[index+1] = val(v.y);
write_cache[index+2] = val(v.x);
write_cache+= 3;
index #endif
}
(&write_cache[0], cell_size, vertices_local_pL[lvl], fp);
fwritefree(write_cache);
= NULL;
write_cache }
}
fprintf(fp, "ENDBINARY\n\t</AppendedData>\n</VTKFile>\n");
#if HTG_SPEED_STATS
struct timespec stop;
if( clock_gettime( CLOCK_REALTIME, &stop) == -1 ) {
( "clock gettime" );
perror}
long offset = ftell(fp);
double t_file = ( stop.tv_sec - start.tv_sec )
+ (double)( stop.tv_nsec - start.tv_nsec )
/ (double)1000000000L;
fprintf (stdout,"Write Time Taken: %lf ms, Throughput: %.2f MB/s, Stats %.2f%%, Header: %.2f%%\n",\
*1000., (double)offset/t_file/(double)sq(1024),\
t_file/t_file*100.,t_header/t_file*100.);
t_statsfflush(stdout);
#endif
fflush(fp);
#if defined(_OPENMP)
(num_omp);
omp_set_num_threads#endif
}
#endif