Post by laytonjb
Post by Craig Tierney
I came across this www.gluster.org <http://www.gluster.org>
Has anyone tried it?
Is it a true parallel file system, allowing concurrent reads and writes to
a file by many processes?
Will it be suitable for HPC applications?
I wouldn't call GlusterFS a parallel filesystem in the same way I would
refer to Lustre or PVFS. GlusterFS is a distributed filesystem,
where complete files are contained on one of multiple servers.
This isn't quite accurate. Depending upon the translators you use, the files
can be striped across servers. For clusters it is almost always the case that
the files will be striped.
Post by Craig Tierney
striping, even they say striping for their implementation is bad
(http://www.gluster.org/docs/index.php/GlusterFS_FAQ#Why_is_striping_bad.3F).
Because of GlusterFS's modular architecture it was easy for them to implement.
They do have MPI-IO support on their roadmap, so maybe they are planning to work around
the issues described in the link above in user space.
Yes, there is a translator that will stripe files. However, see the above comment.
Even they say it isn't a good idea to use it.
I don't see why, for clusters, it would always be the case that files will be striped. Are
you implying that "clusters" means "large distributed HPC systems that read/write very large files"?
There is implicit overhead in reconstructing a striped file that will impact performance
(though it could be minimal; I haven't tested it). Streaming performance may be better, but what
about random IO patterns? If my codes don't do parallel IO, why would I necessarily
add the complexity?
I know Lustre does striping quite well, but not all applications require it.
Post by laytonjb
Post by Craig Tierney
GlusterFS is much more like Ibrix or Netapp/GX than Lustre. It seems best as a distributed NFS
replacement. In my minimal testing, performance scales linearly as you add data servers.
Metadata performance is reasonable (by feel, not by actual measurements).
One of the design ideas behind GlusterFS is that it doesn't have a metadata
server, so I'm not sure what you were measuring. It may have been the
metadata performance of the underlying file system rather than of GlusterFS.
By metadata performance, I meant IOPS. It doesn't have a dedicated metadata server,
but all servers perform the function. The streaming performance is quite
good, but what if I need to use NetCDF files, compile code, or use the filesystem
as a large distributed mailserver?
When I say streaming performance is good, I mean I have been able to get a single server
to push about 300 MB/s. This is a limitation of my storage device, not the
filesystem. I don't know how it performs over the IB transport when a faster
disk array is used.
Post by laytonjb
I haven't tested it yet, but it has some interesting ideas (all in user-space so
there are no kernel mods to worry about, no metadata server, stackable
translators for tuning performance).
Yes, these features are very nice. I liked that I could get it running on an older
kernel (one not supported by the Lustre server) in only a few minutes. So far it is
meeting my needs for a small application. I haven't been using it long, so
I cannot comment on long term stability. When I have some larger storage servers,
I plan to test it further (as well as Lustre).
Craig
Post by laytonjb
Jeff
_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
--
Craig Tierney (craig.tierney at noaa.gov)