Discussion about Garbage Collector

BenO
Goblin
Posts: 241
Joined: Mon Apr 18, 2005 5:03 pm

Post by BenO »

While I was looking for information on D, I found some ideas about garbage collection:

http://www.digitalmars.com/d/garbage.html

What do you think of it?
Benjamin RIGAUD
Software Engineer
Maleficus
Greenskin
Posts: 116
Joined: Sat Jul 30, 2005 11:11 am
Location: Vancouver, B.C. Canada

Post by Maleficus »

Personally, although I'm a .NET developer, I hate garbage collection. The peace of mind of knowing an object is deleted when I tell it to be deleted is one of the few things I miss from C++ and other unmanaged languages.

Really, when you're talking about scenarios like games that potentially instance or delete thousands of objects per frame, the garbage collector can be a pain. Why? Because it only deletes an object once no references to it exist. So you wind up having to give an object a Disposing flag, and check that before using it. If Disposing is true, you remove the reference. Which is effectively the same as setting a deleted object to NULL and checking for NULL before using it.

So yeah, it's great if you're coding a word processor or something, but with graphics programming you need just as much code to check for deleted objects as you would need in C++. More of a pain than it's worth imho.
:wumpus:
OGRE Retired Team Member
Posts: 3067
Joined: Tue Feb 10, 2004 12:53 pm
Location: The Netherlands

Post by :wumpus: »

I like garbage-collected languages; the biggest problem is their interaction with threading (as also mentioned in that article). All the synchronisation, or the 'stop the world' nature of collection, makes it really inefficient in such cases. With multicore and threading on the rise, I wonder how garbage collection will keep up.

Another drawback is that you don't control the exact order in which things get freed; this matters when an object holds system resources. For example, it might be important to free all the hardware vertex buffers before the GL rendersystem shuts down.
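
For comparison, here is a minimal C++ sketch of what a controlled teardown order looks like without a GC. RenderSystem, VertexBuffer and Renderer are hypothetical stand-ins (not OGRE classes), and teardown_log exists only to record the order for the example. It relies on the language rule that class members are destroyed in reverse order of declaration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which destructors run (illustration only).
std::vector<std::string>& teardown_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical stand-ins for a render system and a hardware buffer.
struct RenderSystem {
    ~RenderSystem() { teardown_log().push_back("render system"); }
};

struct VertexBuffer {
    explicit VertexBuffer(RenderSystem& rs) : owner(rs) {}
    ~VertexBuffer() { teardown_log().push_back("vertex buffer"); }
    RenderSystem& owner;  // must outlive the buffer
};

struct Renderer {
    // Members are destroyed in reverse order of declaration,
    // so 'buffer' is always released before 'system'.
    RenderSystem system;
    VertexBuffer buffer{system};
};

void run_example() {
    teardown_log().clear();
    {
        Renderer r;
    }  // r goes out of scope here and both destructors run immediately
    assert(teardown_log().size() == 2);
    assert(teardown_log()[0] == "vertex buffer");
    assert(teardown_log()[1] == "render system");
}
```

With a tracing collector there is no equivalent guarantee for finalizers, which is why explicit shutdown calls tend to be needed for this kind of resource.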
Maleficus wrote: Because it only deletes an object once no references to it exist.
I don't get this; why would you ever want to delete an object to which references do exist? That would result in dangling pointers and potential heap corruption in the non-GC case.
SurfaceTension
Gnoblar
Posts: 9
Joined: Mon Apr 25, 2005 8:47 pm

Post by SurfaceTension »

:wumpus: wrote: I don't get this; why would you ever want to delete an object to which references do exist? That would result in dangling pointers and potential heap corruption in the non-GC case.
I think he means to say "free this resource now" instead of having to make extra sure you've taken down all of your references. After all, GC can have memory leaks too. You could have a reference you forgot to null out and the memory is never deleted.

I'm for managing my own memory. I really like taking advantage of the destructor to clean up all of my memory, and then cascading to clean up other memory. In a GC language, I have to define this behavior myself, and then I have to remember to call the cleanup for the particular classes that need it (unlike in an unmanaged language, where your guard is always up and you're always cleaning up your memory). Remembering to delete and new, especially when you have manager classes that are responsible for allocation and deallocation, really isn't that hard. Don't like using deletes? Use a smart pointer. They get destroyed automatically during scope cleanup. That behavior is much easier to debug than GC.
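
A minimal sketch of the cascading cleanup described above, assuming standard C++ smart pointers (std::unique_ptr). Particle, ParticleManager and the live_objects counter are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Counts live objects so the example can check that nothing leaks.
static int live_objects = 0;

struct Particle {
    Particle() { ++live_objects; }
    ~Particle() { --live_objects; }
};

// Hypothetical manager class that owns everything it allocates.
class ParticleManager {
public:
    ParticleManager() { ++live_objects; }
    ~ParticleManager() { --live_objects; }  // particles_ is cleaned up automatically afterwards

    Particle& spawn() {
        particles_.push_back(std::make_unique<Particle>());
        return *particles_.back();
    }

private:
    std::vector<std::unique_ptr<Particle>> particles_;
};

int run_example() {
    {
        auto manager = std::make_unique<ParticleManager>();
        manager->spawn();
        manager->spawn();
        assert(live_objects == 3);  // one manager plus two particles
    }  // manager is destroyed here, which destroys both particles in turn
    return live_objects;
}
```

No delete appears anywhere; ownership alone decides when each object goes away.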

Don't get me wrong, GC makes it easy, but I wonder how many leaks I've created when I allocated something and then "forgot about it". It certainly isn't something I'd want in my game code.
Advisor
Halfling
Posts: 42
Joined: Mon Aug 22, 2005 9:18 pm

Post by Advisor »

Would it be possible to delete objects explicitly when you wish and still keep garbage collection?
Maleficus
Greenskin
Posts: 116
Joined: Sat Jul 30, 2005 11:11 am
Location: Vancouver, B.C. Canada

Post by Maleficus »

SurfaceTension wrote:
:wumpus: wrote:I don't get this; why would you ever want to delete an object to which references do exist?
I think he means to say "free this resource now" instead of having to make extra sure you've taken down all of your references. After all, GC can have memory leaks too. You could have a reference you forgot to null out and the memory is never deleted.
In .NET the Dispose method is mainly meant for freeing any unmanaged resources you may be using. So in the case of the OgreNet wrapper, calling Dispose deletes the pointer to the unmanaged wrapped object, but not the wrapping object. The GC won't actually swoop in and delete the wrapping object until you haven't used it for x amount of time. And using it can simply mean that it still exists in an array somewhere, for example.

Here's an example scenario:

Suppose you have a bunch of spaceships, stored in a .NET collection. Each ship also has a collection containing a list of the ships that are inside its sensor range. Now suppose one of the ships blows up or something. At the time it blows up, we set its Disposing flag to true, and remove it from the former collection.

But a reference to the object still exists in the latter collection, so the GC won't do anything yet. Next time our AI code runs, we need to iterate through the collection and remove any ships where Disposing is set to true.

Only when no references to the object remain in your code will it get deleted. In game programming, that's the part that makes GC no easier than manually deleting objects. You forget to remove one reference to an object, and BAM!, memory leak city. At least in C++, once you delete the object and set the pointer to NULL, you'll usually get an immediate crash on the null pointer access if you try to use it.
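
For comparison, the same spaceship scenario could be sketched in C++ with std::shared_ptr and std::weak_ptr. This is only an illustration under assumed names (Ship, contacts, prune_contacts are hypothetical): the fleet owns the ships, the sensor lists hold non-owning references, and a destroyed ship shows up as an expired entry instead of a Disposing flag.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical game type for illustration.
struct Ship {
    // Non-owning: a ship in sensor range must not be kept alive by this list.
    std::vector<std::weak_ptr<Ship>> contacts;

    // Drop contacts whose ship has already been destroyed.
    void prune_contacts() {
        contacts.erase(
            std::remove_if(contacts.begin(), contacts.end(),
                           [](const std::weak_ptr<Ship>& w) { return w.expired(); }),
            contacts.end());
    }
};

std::size_t run_example() {
    // The fleet is the single owner of every ship.
    std::vector<std::shared_ptr<Ship>> fleet;
    fleet.push_back(std::make_shared<Ship>());
    fleet.push_back(std::make_shared<Ship>());

    fleet[0]->contacts.push_back(fleet[1]);  // ship 1 is in range of ship 0

    fleet.erase(fleet.begin() + 1);          // ship 1 blows up and is destroyed right away
    assert(fleet[0]->contacts.front().expired());

    fleet[0]->prune_contacts();              // the next AI tick removes the stale entry
    return fleet[0]->contacts.size();
}
```

The bookkeeping pass is still there, so the amount of code is similar; the difference is that the ship's memory is released at the moment it is removed from the fleet, not when the last list is pruned.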
Advisor wrote: Would it be possible to delete objects explicitly when you wish and still keep garbage collection?
Not to my knowledge. If you use .NET objects, you have to put up with the garbage collector. There's a function called Finalize that the GC calls to delete an object, but the compiler won't let you call it (which is lame and annoying).

The closest you can get to not using the GC is that C# does support pointers. But I believe they're only usable with unmanaged types (I could be wrong about that though).
Lorenzo
Gnoblar
Posts: 18
Joined: Sun Aug 07, 2005 5:46 am

Post by Lorenzo »

Maleficus wrote: In .NET the Dispose method is mainly meant for freeing any unmanaged resources you may be using. So in the case of the OgreNet wrapper, calling Dispose deletes the pointer to the unmanaged wrapped object, but not the wrapping object. The GC won't actually swoop in and delete the wrapping object until you haven't used it for x amount of time
Actually it doesn't work like this. The GC is based on reference counting: each time you request a reference to an object, its reference count is incremented by one. When a reference is set to null or the object is disposed (which happens when the current scope ends), the reference count for that object is decreased by one. When the reference count reaches 0, the GC may decide to delete the object on the next Collect (in the case of C#'s GC).

It is up to the GC to decide when to delete the object, but there aren't any memory leaks, since the memory occupied by the object can be reused. When the GC is first started, it reserves an amount of memory for allocating new objects. Each time you request a reference to an object through Something c = new Something(); the GC checks the size of Something and compares it with the amount of free memory available in the memory pool. If there is not enough, the GC calls Collect to free objects which have a reference count of 0. If it gets enough memory, it allocates the new object; if it can't, a new Collect is called to see if some more memory can be freed. In the end, if there is enough memory, the object is allocated and you get your reference; otherwise an out-of-memory exception is thrown. (Actually I have simplified the way a GC works a lot; there are a lot of other things going on.)

So as you can see there aren't memory leaks, only possible slowdowns while searching for free memory, and on modern systems the initial memory pool is pretty big, so you won't run out of memory for a long time. Not all GCs work in the same way, but the main reason for introducing a GC into an environment is to avoid memory leaks and bugs caused by bad use of pointers (which anyway are 90% of bugs). A GC is useful if you are managing a LOT of things (environments like .NET or Java simply won't work without a GC; there are too many things to take into account). Using a GC with a language like C++ is pointless, since C++ was not made to use a GC; there are smart pointers, which are a better way to handle memory (and some of them work in a way similar to a GC, for example reference-counted smart pointers, SharedPtr<> in Ogre's language :)).
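
Whatever a particular runtime does internally, the counting scheme described above is what reference-counted smart pointers implement. A minimal sketch using the standard std::shared_ptr (an assumption for illustration; Ogre's SharedPtr<> has its own interface), with Something as a placeholder type:

```cpp
#include <cassert>
#include <memory>

struct Something {
    int value = 42;
};

long run_example() {
    auto a = std::make_shared<Something>();  // count is 1
    assert(a.use_count() == 1);

    std::weak_ptr<Something> observer = a;   // observing does not change the count
    {
        auto b = a;                          // copying the pointer: count is 2
        assert(a.use_count() == 2);
    }                                        // b leaves scope: count is back to 1
    assert(a.use_count() == 1);

    a.reset();                               // count reaches 0: object destroyed immediately
    assert(observer.expired());
    return observer.use_count();
}
```

Note the difference from a collector: the object is destroyed at the exact point where the count drops to zero, not at some later collection.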
Maleficus wrote: The closest you can get to not using the GC is that C# does support pointers. But I believe they're only usable with unmanaged types (I could be wrong about that though).
You're almost right: pointers in C# are mainly useful with unmanaged types (you can use them with Win32 or other native APIs, for example), but you can also have a pointer that points into a C# managed object:

Code: Select all

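using System;

class Program
{
    // Must be compiled with unsafe code enabled (/unsafe).
    static unsafe void func(byte[] b)
    {
        // fixed pins the array so the GC cannot move it while p is in use.
        fixed (byte* p = b)
        {
            *p = 4; // writes b[0]; whatever you want :)
        }
    }

    public static void Main(string[] args)
    {
        byte[] b = new byte[100];
        func(b);
    }
}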
But as you can see, you have to use the fixed keyword so that the GC won't move the object in memory and the address stays valid while the fixed block executes.
Of course there's no Object *obj = new Object(); in C#, since there's no delete and, as stated before, an address in C# doesn't have a true meaning (so delete obj; may delete your planet instead of your spaceship :)), since the GC may decide to move the objects in memory.

Anyway, in C# use the GC, in C++ use smart pointers, and everyone will be happy :)

bye!

Lorenzo
Bronski
Gnoblar
Posts: 3
Joined: Tue Aug 02, 2005 11:20 am

Post by Bronski »

I think that reference counted smart pointers are the way to go.

I think their only drawback compared to a more advanced garbage collection scheme is that they cannot handle cyclic references properly.
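
The cyclic reference drawback can be shown in a few lines. This is a sketch with the standard std::shared_ptr and std::weak_ptr; StrongNode, WeakNode and the live_nodes counter are hypothetical names for the example. Two objects that own each other never reach a count of zero, while a weak back link avoids the problem.

```cpp
#include <cassert>
#include <memory>

static int live_nodes = 0;

struct StrongNode {
    StrongNode() { ++live_nodes; }
    ~StrongNode() { --live_nodes; }
    std::shared_ptr<StrongNode> other;  // owning link in both directions
};

struct WeakNode {
    WeakNode() { ++live_nodes; }
    ~WeakNode() { --live_nodes; }
    std::shared_ptr<WeakNode> next;     // owning link
    std::weak_ptr<WeakNode> prev;       // non-owning back link
};

int leaked_with_strong_cycle() {
    live_nodes = 0;
    StrongNode* raw = nullptr;
    {
        auto a = std::make_shared<StrongNode>();
        auto b = std::make_shared<StrongNode>();
        a->other = b;
        b->other = a;   // cycle: each keeps the other's count above zero
        raw = a.get();
    }
    int leaked = live_nodes;  // both nodes outlived their scope
    raw->other.reset();       // break the cycle by hand so the demo itself does not leak
    return leaked;
}

int leaked_with_weak_back_link() {
    live_nodes = 0;
    {
        auto a = std::make_shared<WeakNode>();
        auto b = std::make_shared<WeakNode>();
        a->next = b;
        b->prev = a;    // the weak link does not contribute to the count
    }
    return live_nodes;
}
```

A tracing collector does not need the weak link, because it frees whatever is unreachable from the program's roots regardless of counts.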

Why I dislike garbage collection:

- It's much more complicated than reference-counted smart pointers for a similar result, so it's kind of a waste of effort.

- It only frees you from the burden of managing memory. For other kinds of resources (for instance, if an object has opened a file or a socket), you have to explicitly free them, or explicitly ask the object to free them.

A friend told me there's a disposable interface (IDisposable) in .NET, with a Dispose method you can call to explicitly ask an object to release its stuff before it's actually deleted.

It defeats the whole purpose of garbage collection if there are things that you have to remember to release. C++, thanks to the RAII pattern, doesn't have this problem.

- I've heard horror stories from a friend who writes .NET code at work: sometimes seemingly innocent code runs like crap and uses tons of heap storage because, behind the scenes, a ton of temporary objects are created, and you then have to refactor your code in a way that creates fewer temporary objects.

Again, it defeats the purpose of garbage collection since it's supposed to be transparent and free the programmer from certain worries, and instead it creates entirely new, artificial ones.
Lorenzo
Gnoblar
Posts: 18
Joined: Sun Aug 07, 2005 5:46 am

Post by Lorenzo »

Bronski wrote: - It's much more complicated than reference-counted smart pointers for a similar result, so it's kind of a waste of effort.
Well, smart pointers are useful, but as I said before, for a complex environment like .NET or Java you need garbage collection, and garbage collection isn't just removing the delete keyword from the language.
Bronski wrote: - It only frees you from the burden of managing memory. For other kinds of resources (for instance, if an object has opened a file or a socket), you have to explicitly free them, or explicitly ask the object to free them.
Well, why should the GC know what you are doing? The GC works with memory, not sockets, files, or other kinds of things. Just implement a Dispose method in your class which deals with those kinds of things.
Bronski wrote: A friend told me there's a disposable interface in .NET, with a Dispose method you can call to explicitly ask an object to release its stuff before it's actually deleted.
There's always the Finalize method (implemented with destructors in Managed C++ and C#), but using Dispose is better.
Bronski wrote: It defeats the whole purpose of garbage collection if there are things that you have to remember to release
Only for unmanaged resources.
Bronski wrote: I've heard horror stories from a friend who writes .NET code at work: sometimes seemingly innocent code runs like crap and uses tons of heap storage because, behind the scenes, a ton of temporary objects are created, and you then have to refactor your code in a way that creates fewer temporary objects.
Well, this might be true for the Java Virtual Machine (a hello world takes up more than 40 MB of RAM on my system with Java 1.5), but not on .NET. The only objects created in .NET are an AppDomain and an assembly representing your application, plus some other objects, but not tons.
Bronski wrote: Again, it defeats the purpose of garbage collection since it's supposed to be transparent and free the programmer from certain worries, and instead it creates entirely new, artificial ones.
Garbage collection is transparent. If you use unmanaged resources, then you're bypassing the garbage collector, and it is up to you to manage them, not the garbage collector. Of course, with really BIG applications you may want to have some more control over garbage collection; there's the Collect method for this purpose.
Bronski
Gnoblar
Posts: 3
Joined: Tue Aug 02, 2005 11:20 am

Post by Bronski »

Lorenzo wrote: Well, why should the GC know what you are doing? The GC works with memory, not sockets, files, or other kinds of things. Just implement a Dispose method in your class which deals with those kinds of things.
Yeah, a Dispose method. You have to remember to call it, and you have to be careful with regard to exceptions.

In a language where objects have a deterministic lifetime, like C++, things can be much easier, less error-prone, and much more readable due to less useless clutter in the code. See http://blogs.msdn.com/hsutter/archive/2 ... 03137.aspx for a comparison of the RAII and Dispose patterns, and see the kind of incantations you have to write, where 3 lines of C++ are sufficient, whenever you want to work around the non-deterministic nature of garbage-collected languages.
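
To make the RAII side of that comparison concrete, here is a minimal sketch (not taken from the linked article). Connection is a hypothetical stand-in for a socket wrapper that does no real I/O, and open_connections only exists so the example can check the result. The resource is released on both the normal path and the exception path without any cleanup code at the call site.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-in for an OS resource such as a socket; hypothetical, no real I/O.
static int open_connections = 0;

class Connection {
public:
    Connection() { ++open_connections; }    // acquire in the constructor
    ~Connection() { --open_connections; }   // release in the destructor, always runs
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;

    void send(bool fail) {
        if (fail) throw std::runtime_error("send failed");
    }
};

// Returns true if the send succeeded; the connection is closed either way.
bool transmit(bool fail) {
    try {
        Connection c;
        c.send(fail);
        return true;
    } catch (const std::runtime_error&) {
        // c has already been destroyed by stack unwinding at this point.
        return false;
    }
}
```

The try/catch is only there to turn the exception into a return value for the example; the release itself needs no explicit code.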
Lorenzo wrote: There's always the Finalize method (implemented with destructors in Managed C++ and C#), but using Dispose is better.
Finalize is a method that may be called someday, when the garbage collector feels like it. It doesn't cut it in many cases. There are things you want to be sure don't hang around longer than their useful life span.

Bronski wrote: It defeats the whole purpose of garbage collection if there are things that you have to remember to release
Lorenzo wrote: Only for unmanaged resources.
Please clarify what you mean by unmanaged resources.
I'm talking about cases where an object has, for instance, opened a socket. If you don't use the Dispose pattern or any other kind of explicit, error-prone way of asking the object to release it, you have to wait an indeterminate amount of time until it's actually released.
Lorenzo wrote: Well, this might be true for the Java Virtual Machine (a hello world takes up more than 40 MB of RAM on my system with Java 1.5), but not on .NET. The only objects created in .NET are an AppDomain and an assembly representing your application, plus some other objects, but not tons.
My friend works on industrial test benches. They have to acquire, analyse and store data coming in very fast. And yes, they did have cases where .NET made a mess of temporary objects during the data acquisition loop.
Lorenzo
Gnoblar
Posts: 18
Joined: Sun Aug 07, 2005 5:46 am

Post by Lorenzo »

Bronski wrote: Yeah, a Dispose method. You have to remember to call it, and you have to be careful with regard to exceptions.
Well, programming requires some attention; it is not a Click & Create process.
Bronski wrote: In a language where objects have a deterministic lifetime, like C++, things can be much easier, less error-prone, and much more readable due to less useless clutter in the code
And 90% of bugs are caused by wrong use of pointers and people accessing null pointers or freeing them, not counting the buffer overflows caused by stupid use of strcpy (99% of them?).
Bronski wrote: Finalize is a method that may be called someday, when the garbage collector feels like it. It doesn't cut it in many cases. There are things you want to be sure don't hang around longer than their useful life span.
If you are so resource hungry, just write a thread that calls GC.Collect() periodically and you're done.
Bronski wrote: Please clarify what you mean by unmanaged resources.
I'm talking about cases where an object has, for instance, opened a socket. If you don't use the Dispose pattern or any other kind of explicit, error-prone way of asking the object to release it, you have to wait an indeterminate amount of time until it's actually released.
Unmanaged resources are objects which aren't handled by .NET, such as a file opened with the Win32 API (CreateFile, ReadFile, WriteFile, etc.) or sockets opened with Winsock (WSA*). Of course, if you use unmanaged resources you have to handle them yourself.
If you use .NET managed resources, then .NET handles them for you. Of course, if the managed resource you're using (for example the Socket class) makes use of unmanaged resources, then you have to call Dispose on that class or wait for Finalize (which does get called). That's the way .NET works.
Bronski wrote: My friend works on industrial test benches. They have to acquire, analyse and store data coming in very fast. And yes, they did have cases where .NET made a mess of temporary objects during the data acquisition loop.
.NET doesn't create objects which you don't tell it to create; there's no voodoo magic. If you're working with a lot of objects in a loop, you may consider using structs instead of classes, which on some occasions are faster (in C# a struct and a class aren't the same thing, unlike in C++).
Bronski
Gnoblar
Posts: 3
Joined: Tue Aug 02, 2005 11:20 am

Post by Bronski »

Lorenzo wrote:
Bronski wrote: Yeah, a Dispose method. You have to remember to call it, and you have to be careful with regard to exceptions.
Well, programming requires some attention; it is not a Click & Create process.
By that logic, we should all still be programming in assembler. Why let the compiler do tedious things for you?

Bronski wrote: In a language where objects have a deterministic lifetime, like C++, things can be much easier, less error-prone, and much more readable due to less useless clutter in the code
Lorenzo wrote: And 90% of bugs are caused by wrong use of pointers and people accessing null pointers or freeing them, not counting the buffer overflows caused by stupid use of strcpy (99% of them?).
First off, pointers have nothing to do with it, so that's kind of outside the scope of a discussion about the merits of garbage collection. It happens that you can use pointers in C++, and you can use strcpy. A language with deterministic deletion of objects could perfectly well exist without pointers.

Besides, strcpy is a thing of the past. The C++ way is to use the string class (as well as the vector class and such from the STL).
There are not many places in C++ where a plain, regular pointer can't be replaced with something safer like a smart pointer or an iterator. Accessing null pointers or freeing them isn't the kind of thing that happens if you use the RAII pattern as much as possible, as well as smart pointers and such things.
Actually, if I were to nitpick, "freeing a null pointer" is OK as long as you use new/delete. If you keep on using legacy functions like malloc/free/strcpy etc., you're asking for trouble anyway.
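
The nitpick about freeing a null pointer can be checked with a tiny example. Widget is a placeholder type; the behaviour shown (delete on a null pointer does nothing) is guaranteed by the C++ standard.

```cpp
#include <cassert>

struct Widget {
    int id = 7;
};

bool run_example() {
    Widget* w = nullptr;
    delete w;        // deleting a null pointer is defined to do nothing

    w = new Widget;
    delete w;        // the object is destroyed here
    w = nullptr;     // reset so that a second delete is harmless
    delete w;        // again a no-op
    return w == nullptr;
}
```

Without the reset to nullptr, the second delete would be a double free, which is the kind of bug smart pointers remove by construction.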

Lorenzo wrote: If you are so resource hungry, just write a thread that calls GC.Collect() periodically and you're done.
So, have a thread running in the background, randomly stealing an unpredictable amount of processing time? For what gain over refcounted smart pointers, again?

Lorenzo wrote: If you use .NET managed resources, then .NET handles them for you. Of course, if the managed resource you're using (for example the Socket class) makes use of unmanaged resources, then you have to call Dispose on that class or wait for Finalize (which does get called). That's the way .NET works.
That's exactly the problem I have with this. Everyone always says that programming with .NET (or other such things, like Java) is easier because you don't have to think about freeing memory, etc.

But then you have to remember to release things like files and sockets, and for some reason having to do this by hand is OK for a lot of people, even though forgetting to do any of this is just as likely to cause a bug or performance problem as forgetting to free some memory.
davidclifton
Kobold
Posts: 29
Joined: Tue Aug 23, 2005 2:43 am

Post by davidclifton »

I think what it comes down to, from my point of view, is that when it comes to accessing files etc., I abstract this sort of behavior away from the rest of my logic, and quite easily. A nifty little method called 'readfile' that makes sure I properly dispose of the file is certainly not a big deal. On the other hand, I'm allocating memory all the time in a properly object-oriented language.

It is a lot harder to abstract away my memory management, and quite honestly there are very few applications in the whole scheme of things that ever did it well.
tps12
Gnoblar
Posts: 1
Joined: Tue Aug 23, 2005 8:11 pm

Post by tps12 »

Lorenzo wrote: Actually it doesn't work like this. The GC is based on reference counting: each time you request a reference to an object, its reference count is incremented by one. When a reference is set to null or the object is disposed (which happens when the current scope ends), the reference count for that object is decreased by one. When the reference count reaches 0, the GC may decide to delete the object on the next Collect (in the case of C#'s GC).
As described in this article, .NET's GC uses a generational mark-and-sweep algorithm, not a reference count.
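
To show the difference from counting, here is a toy mark-and-sweep collector sketched in C++. It is an illustration of the general technique only: it is not generational and makes no claim about how the .NET runtime is implemented, and Heap, Object and every method name are hypothetical. Objects are kept or freed based on reachability from a set of roots, so a cycle that nothing points to is reclaimed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy tracing collector, for illustration only.
struct Object {
    std::vector<Object*> refs;  // outgoing references to other heap objects
    bool marked = false;
};

class Heap {
public:
    Object* allocate() {
        objects_.push_back(std::make_unique<Object>());
        return objects_.back().get();
    }

    void add_root(Object* o) { roots_.push_back(o); }
    std::size_t size() const { return objects_.size(); }

    // Mark everything reachable from the roots, then free the rest.
    void collect() {
        for (auto& o : objects_) o->marked = false;

        std::vector<Object*> pending(roots_.begin(), roots_.end());
        while (!pending.empty()) {
            Object* o = pending.back();
            pending.pop_back();
            if (o->marked) continue;  // already visited, also stops on cycles
            o->marked = true;
            for (Object* r : o->refs) pending.push_back(r);
        }

        std::vector<std::unique_ptr<Object>> survivors;
        for (auto& o : objects_) {
            if (o->marked) survivors.push_back(std::move(o));
        }
        objects_ = std::move(survivors);  // unmarked objects are destroyed here
    }

private:
    std::vector<std::unique_ptr<Object>> objects_;
    std::vector<Object*> roots_;
};

std::size_t run_example() {
    Heap heap;
    Object* a = heap.allocate();
    Object* b = heap.allocate();
    Object* c = heap.allocate();
    a->refs.push_back(b);
    b->refs.push_back(a);  // a and b form a cycle that no root reaches
    heap.add_root(c);      // only c is reachable

    heap.collect();
    return heap.size();
}
```

The trade-off discussed in the thread is visible here too: nothing is freed until collect() runs, so the time of destruction is not tied to the point where the last reference disappears.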

Also, I don't really understand the "you have to remember to close sockets yourself" objection to garbage collection...that's true of any language. If you have an object that uses a socket in C++, you need to either declare it on the stack or remember to call delete when you're done with it. In C# you either need to declare it in a using() block or call Dispose when you're done with it. What's the diff?