Discussion:
[Lustre-discuss] GlusterFS and Lustre
rishi pathak
2008-04-30 07:14:05 UTC
I came across this: www.gluster.org
Has anyone tried it?
Is it a true parallel file system, allowing concurrent reads and writes to a
file by many processes?
Will it be suitable for HPC applications?
--
Regards--
Rishi Pathak
Craig Tierney
2008-04-30 14:33:38 UTC
I came across this www.gluster.org <http://www.gluster.org>
Has anyone tried it?
Is it a true parallel file system, allowing concurrent reads and writes to
a file by many processes?
Will it be suitable for HPC applications?
I wouldn't call GlusterFS a parallel filesystem in the same way I would
refer to Lustre or PVFS. GlusterFS is a distributed filesystem,
where complete files are contained on one of multiple servers. Although it supports
striping, even the developers say striping in their implementation is bad
(http://www.gluster.org/docs/index.php/GlusterFS_FAQ#Why_is_striping_bad.3F).
Striping was easy for them to implement because of GlusterFS's modular architecture.
They do have MPI-IO support on their roadmap, so maybe they are planning to work around
the issues described in the link above in user space.
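
To make the distinction between whole-file distribution and striping concrete, here is a minimal Python sketch of the two placement schemes. It is only an illustration under stated assumptions: the function names, the SHA-256 path hash, and the round-robin stripe layout are hypothetical and are not taken from GlusterFS or Lustre.

```python
import hashlib


def whole_file_server(path: str, num_servers: int) -> int:
    """Whole-file placement (assumed scheme): hash the path to pick the
    single server that stores the entire file."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


def striped_server(offset: int, stripe_size: int, num_servers: int) -> int:
    """Round-robin striping (assumed layout): the byte at `offset` lives on
    server (offset // stripe_size) % num_servers."""
    return (offset // stripe_size) % num_servers


if __name__ == "__main__":
    MIB = 1024 * 1024
    # Every byte of this file is served by the same server.
    print("whole file ->", whole_file_server("/data/run1.nc", 4))
    # Consecutive 1 MiB stripes of one file rotate across the four servers.
    print("stripes    ->", [striped_server(i * MIB, MIB, 4) for i in range(6)])
```

With whole-file placement, a single large file is limited to one server's bandwidth; with striping, one file can draw on several servers at once, at the cost of touching more servers per file.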

GlusterFS is much more like Ibrix or Netapp/GX than Lustre. It seems best as a distributed NFS
replacement. In my minimal testing, performance scales linearly as you add data servers.
Metadata performance is reasonable (by feel, not by actual measurements). Some of the
more interesting features of GlusterFS are automatic file replication
(AFR), layered performance translators on both the client and server side, and support
for heterogeneous storage servers. It is also really easy to set up and maintain.

I don't have a Lustre setup ready to make any apples to apples comparisons though.
However, I believe that the two products fit two different needs. Also, file
systems are hard and take a long time to stabilize. Lustre has put in its time,
and we are now seeing the benefits. GlusterFS is less mature.

Note, comments above are from some basic testing that I have done. I am
not a GlusterFS developer.

Craig
_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
--
Craig Tierney (craig.tierney at noaa.gov)
laytonjb
2008-04-30 14:50:22 UTC
Post by Craig Tierney
I came across this www.gluster.org <http://www.gluster.org>
Has anyone tried it?
Is it a true parallel file system, allowing concurrent reads and writes to
a file by many processes?
Will it be suitable for HPC applications?
I wouldn't call GlusterFS a parallel filesystem in the same way I would
refer to Lustre or PVFS. GlusterFS is a distributed filesystem,
where complete files are contained on one of multiple servers.
This isn't quite accurate. Depending upon the translators you use, the files
can be striped across servers. For clusters it is almost always the case that
the files will be striped.
Post by Craig Tierney
striping, even they say striping for their implementation is bad
(http://www.gluster.org/docs/index.php/GlusterFS_FAQ#Why_is_striping_bad.3F).
Because of GlusterFS's modular architecture it was easy for them to implement.
They do have MPI-IO support on their roadmap, so maybe they are planning to work around
the issues described in the link above in user space.
GlusterFS is much more like Ibrix or Netapp/GX than Lustre. It seems best as a distributed NFS
replacement. In my minimal testing, performance scales linearly as you add data servers.
Metadata performance is reasonable (by feel, not by actual measurements).
One of the design ideas behind GlusterFS is that it doesn't have a metadata
server, so I'm not sure what you were measuring. It may have been the
metadata performance of the underlying file system rather than of GlusterFS.

I haven't tested it yet, but it has some interesting ideas (all in user-space so
there are no kernel mods to worry about, no metadata server, stackable
translators for tuning performance).

Jeff
Craig Tierney
2008-04-30 16:00:27 UTC
Post by laytonjb
Post by Craig Tierney
I came across this www.gluster.org <http://www.gluster.org>
Has anyone tried it?
Is it a true parallel file system, allowing concurrent reads and writes to
a file by many processes?
Will it be suitable for HPC applications?
I wouldn't call GlusterFS a parallel filesystem in the same way I would
refer to Lustre or PVFS. GlusterFS is a distributed filesystem,
where complete files are contained on one of multiple servers.
This isn't quite accurate. Depending upon the translators you use, the files
can be striped across servers. For clusters it is almost always the case that
the files will be striped.
Post by Craig Tierney
striping, even they say striping for their implementation is bad
(http://www.gluster.org/docs/index.php/GlusterFS_FAQ#Why_is_striping_bad.3F).
Because of GlusterFS's modular architecture it was easy for them to implement.
They do have MPI-IO support on their roadmap, so maybe they are planning to work around
the issues described in the link above in user space.
Yes, there is a translator that will stripe files. However, see the above comment.
Even they say it isn't a good idea to use it.

I don't see why files would always be striped on clusters. Are you implying
that "clusters" means "large distributed HPC systems that read/write very large files"?
There is inherent overhead in reconstructing a striped file that will impact performance
(though it could be minimal; I haven't tested it). Streaming performance may be better, but what
about random IO patterns? If my codes don't do parallel IO, why would I necessarily
add the complexity?
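
As a rough illustration of the streaming versus random IO point, the sketch below counts how many servers a single read has to contact under round-robin striping. The layout and the function name are assumptions made for the example, not a description of how GlusterFS or Lustre actually place stripes.

```python
def servers_touched(offset: int, length: int,
                    stripe_size: int, num_servers: int) -> set:
    """Return the set of servers holding bytes [offset, offset + length)
    under an assumed round-robin stripe layout."""
    if length <= 0:
        return set()
    first = offset // stripe_size                 # index of first stripe read
    last = (offset + length - 1) // stripe_size   # index of last stripe read
    if last - first + 1 >= num_servers:
        # The range spans at least one full rotation, so every server is hit.
        return set(range(num_servers))
    return {stripe % num_servers for stripe in range(first, last + 1)}


if __name__ == "__main__":
    KIB = 1024
    stripe = 128 * KIB
    # Small random read inside one stripe: one server.
    print(servers_touched(0, 4 * KIB, stripe, 4))
    # Small read that straddles a stripe boundary: two servers.
    print(servers_touched(126 * KIB, 4 * KIB, stripe, 4))
    # 1 MiB streaming read: all four servers work in parallel.
    print(servers_touched(0, 1024 * KIB, stripe, 4))
```

Large sequential reads spread across every server, which is where striping helps. Small random reads usually land on one server, so they gain little from striping and can even cost an extra round trip when they cross a stripe boundary.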

I know Lustre does striping quite well, but not all applications require it.
Post by laytonjb
Post by Craig Tierney
GlusterFS is much more like Ibrix or Netapp/GX than Lustre. It seems best as a distributed NFS
replacement. In my minimal testing, performance scales linearly as you add data servers.
Metadata performance is reasonable (by feel, not by actual measurements).
One of the design ideas behind GlusterFS is that it doesn't have a metadata
server, so I'm not sure what you were measuring. It may have been the
metadata performance of the underlying file system rather than of GlusterFS.
By metadata performance, I meant IOPS. It doesn't have a dedicated metadata server,
but all servers perform the function. The streaming performance is quite
good, but what if I need to use NetCDF files, compile code, or use the filesystem
as a large distributed mailserver?

As for why I say streaming performance is good: I have been able to get a single server
to push about 300 MB/s. That is a limitation of my storage device, not the
filesystem. I don't know how it performs over the IB transport when a faster
disk array is used.
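
For anyone who wants to reproduce this kind of number, here is a minimal sketch of a streaming-write measurement in Python. It is not the method used for the figure above; the function names, file sizes, and the ideal-scaling helper are assumptions for illustration, and a single-threaded Python writer may itself become the bottleneck on fast storage.

```python
import os
import time


def streaming_write_mb_s(path: str, total_mb: int = 64, block_mb: int = 1) -> float:
    """Write `total_mb` MiB sequentially to `path` and return MiB/s.
    fsync is included so the page cache does not inflate the result."""
    block = b"\0" * (block_mb * 1024 * 1024)
    blocks = total_mb // block_mb
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    return (blocks * block_mb) / elapsed


def ideal_aggregate_mb_s(per_server_mb_s: float, num_servers: int) -> float:
    """Upper bound if throughput really scales linearly with server count."""
    return per_server_mb_s * num_servers


if __name__ == "__main__":
    import tempfile
    with tempfile.TemporaryDirectory() as d:
        rate = streaming_write_mb_s(os.path.join(d, "stream.bin"), total_mb=32)
        print(f"measured: {rate:.0f} MiB/s")
    # Four servers at about 300 MB/s each would top out near 1200 MB/s.
    print("ideal for 4 servers:", ideal_aggregate_mb_s(300, 4))
```

Point `path` at a directory on the mounted filesystem under test rather than a temporary directory, and use a file much larger than client memory for a meaningful result.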
Post by laytonjb
I haven't tested it yet, but it has some interesting ideas (all in user-space so
there are no kernel mods to worry about, no metadata server, stackable
translators for tuning performance).
Yes, these features are very nice. I liked that I could get it running on an older
kernel (one not supported by the Lustre server) in only a few minutes. So far it is
meeting my needs for a small application. I haven't been using it long, so
I cannot comment on long-term stability. When I have some larger storage servers,
I plan to test it further (as well as Lustre).

Craig
--
Craig Tierney (craig.tierney at noaa.gov)
laytonjb
2008-04-30 16:15:59 UTC
Post by Craig Tierney
Post by laytonjb
Post by Craig Tierney
I came across this www.gluster.org <http://www.gluster.org>
Has anyone tried it?
Is it a true parallel file system, allowing concurrent reads and writes to
a file by many processes?
Will it be suitable for HPC applications?
I wouldn't call GlusterFS a parallel filesystem in the same way I would
refer to Lustre or PVFS. GlusterFS is a distributed filesystem,
where complete files are contained on one of multiple servers.
This isn't quite accurate. Depending upon the translators you use, the files
can be striped across servers. For clusters it is almost always the case that
the files will be striped.
Post by Craig Tierney
striping, even they say striping for their implementation is bad
Hmm... The last time I talked to AB he suggested using striping for better
performance. But as you say below, it depends upon the stripe size and
the other translators in use (I've seen those drive performance).
Post by Craig Tierney
Post by laytonjb
Post by Craig Tierney
(http://www.gluster.org/docs/index.php/GlusterFS_FAQ#Why_is_striping_bad.3F).
Because of GlusterFS's modular architecture it was easy for them to implement.
They do have MPI-IO support on their roadmap, so maybe they are planning to work around
the issues described in the link above in user space.
Yes, there is a translator that will stripe files. However, see the above comment.
Even they say it isn't a good idea to use it.
I don't see why files would always be striped on clusters. Are you implying
that "clusters" means "large distributed HPC systems that read/write very large files"?
I like the idea of striped files from the perspective that, if I lose the server
where a file is located, I've lost access to the file until it's restored. I can
mirror the file, but that wastes space.

But, as you point out, it depends upon the application(s). (I think I'll get a
tattoo that says that :) ).
Post by Craig Tierney
There is inherent overhead in reconstructing a striped file that will impact performance
(though it could be minimal; I haven't tested it).
Yep - good comment. I haven't tested the reconstruction, either manual or with AFR.

Jeff
