Talk:Depth-first search

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia

Is this even relevant?[edit]

When talking about the iterative approach to dfs, the article mentions one of the differences with bfs is that:


"it delays checking whether a vertex has been discovered until the vertex is popped from the stack rather than making this check before adding the vertex"


That's a moot point, as dfs could have just as easily written with the discovered check happening before adding in the stack. — Preceding unsigned comment added by 24.56.255.3 (talk) 11:57, 10 June 2019 (UTC)[reply]

No, it couldn't have been written like that. Consider the graph {1: (2, 3), 2: (), 3: (4, 2), 4: ()} (Python dictionary notation, mapping vertices to their immediate successors). If you do the discovered check before pushing, i.e. you write the algorithm as

procedure DFS_iterative(G, v) is
    let S be a stack
    S.push(v)
    while S is not empty do
        v = S.pop()
        for all edges from v to w in G.adjacentEdges(v) do
            if w is not labelled as discovered then
                label w as discovered
                S.push(w)

this gives you 1, 3, 4, 2 and not 1, 3, 2, 4, since 2 is labelled as discovered after 1 is popped, and hence it doesn't get pushed after 3 is popped. The underlying issue is that 2 occurs in two places in the graph, as a child of 1 and as a child of 3, and the place where it's discovered first is later in the depth-first order.

For breadth-first search it doesn't affect the correctness of the algorithm whether you do the check before enqueueing or after dequeueing. But for depth-first search, it does matter. The paragraph below the part you quoted tries to point this out, maybe somewhat unclearly. It could be improved, perhaps. Stickinsect2 (talk) 16:13, 1 January 2022 (UTC)[reply]

Weird Example[edit]

Another more basic example should be given to start with, without the link from F to E. Surely it would be more intuitive to put the letters in a different order Something like

 
     A
  B  C  D
E F  G

Or maybe the order that the search would go?

     A
  B  E  G
C D  F

Opticyclic 16:33, 21 April 2006 (UTC)[reply]


I think the psuedocode give in incorrect-- it enters an infinite loop! the node must be marked as visited BEFORE recursively calling dfs on adjacent nodes. pleaes confirm this! Yonestar 17:26, 20 April 2006 (UTC)[reply]


What's with the "ie where's the recursion?" DFS doesn't require recursion... no algorithm does. The stack makes it so you don't have to recurse (that's all recursion really does for you anyway, is add to a stack). If lack of recursion is the only "problem" with the algorithm, then there is no problem.


Changed from "DFS is optimal" to "DFS is not optimal". Please Confirm.

I disagree. Any algorithm which examines every vertex and every edge must be at least overall, which DFS is. Therefore, DFS is asymptotically optimal. Did you have a different idea of optimality in mind? bbi5291 (talk) 16:55, 15 November 2009 (UTC)[reply]
Actually, I guess it depends on the application. If we're searching for a path, DFS is obviously not optimal in many cases, especially if the graph is infinite. But it does give optimal solutions to many important problems such as finding strongly connected components and biconnected components. bbi5291 (talk) 17:18, 15 November 2009 (UTC)[reply]

can you show me the program in c for dfs


Whoever "corrected" the page to put E on the end of the path, please do not un-correct it again. DFS will follow the link from F to E before getting to E from A. The apparent tree view does not reflect the actual DFS tree, and I chose it that way deliberately for didactic purposes. Jerf


Can you explain what all non-negative path costs is refering to ?


DFS's can be preformed for graphs in general as well, not just Trees. Though if you're thinking specifically of the technique as used in state-space search's I guess not. Perhaps, these two concepts should be seperated? Deepak 01:39, 17 Jan 2004 (UTC)

They're awfully closely related... I've tried to improve the situation in my last edit, but there still needs to be some more seperation of "depth-first traversal" and "depth-first search". Jerf


Why does the page say that DFS is a method for traversing trees, and then goes on to give an example of a general graph? (as it clearly has a cycle it is not a tree). The page should either state that it is a method for traversing graphs, or change the example to a tree.


The first example should be one of a plain hierarchical tree instead of a graph. Simple examples first more complex later.

The algorithm given in pseudocode section is wrong?[edit]

Well, headline explains it all. It is not depth first for a general graph because of the Color(grey) thing.

I think the color(grey) thing shouldn't be there. and, there should be a color check just after the pop, to skip the node or not.

also, for generalization of the alg. to graphs, there should be an iteration over the nodes above the while loop. I'll change the pseudocode, if anyone has objections, we can revert it.

Recursive solution?[edit]

Why do we have TWO solutions, both of them iterative, creatings stacks, etc? It seems that in general, iterative approaches that have to heavily-manipulate stacks are well-suited to recursive approaches (at least - it is conceptually easier) - and that is certainly true here. I'm definitely going to replace one of these algorithms with a recursive solution. I'd prefer to replace the second, it looks less intuitive. (Remember we're not here to give the best-optimised code, but to teach how it works - therefore the conceptually easier code is always the best on wikipedia). —EatMyShortz 11:46, 28 February 2006 (UTC)[reply]

Pseudocode!=Python[edit]

As much as I love python, it shouldn't be marked as pseudocode. The second one was better, but still not pseudocode.

That's true. I changed it. —Preceding unsigned comment added by Srrrgei (talkcontribs) 20:27, 3 May 2009 (UTC)[reply]

Why is Python better suited than pseudocode in this article? I would prefer the pseudocode. —Preceding unsigned comment added by 38.99.165.166 (talk) 19:53, 6 May 2009 (UTC)[reply]
Some people are under the impression that Python is easy and anyone can learn it. Unfortunately, this isn't the case and is detrimental to anyone wishing to implement the algorithm in another language. Pseudo-code is preferred because it abstracts away the nasty details of a particular language.--OMouse (talk) 14:02, 8 December 2009 (UTC)[reply]
True. Can someone change it to pseudocode? --Robin (talk) 14:17, 8 December 2009 (UTC)[reply]
False. the code is still correct. And if "some people are under the impression that Python is easy" than the example is clearly a good thing for them, right? or are you against people that think python is easy? if you miss the pseudo code, add one. i for one would welcome examples in every single language. Removing correct information from wikipedia page is vandalism in my humble opinion. Here's the link to everyone how misses the examples http://en.wikipedia.org/w/index.php?title=Depth-first_search&oldid=351536941 (NOTE: it's very hard to find a good example of non-recursion example in python outside of the history in this article). create a new page for examples if you want, but do not trash contributions. Gcbwiki (talk) 04:11, 10 January 2011 (UTC)[reply]
I disagree that we should have examples in multiple languages. Wikipedia is not a code repository. One reason for using pseudocode rather than Python here is that (compared to most other languages) Python has or had unusually low limits on recursion depth that make a recursive version of DFS a bad idea, so you have to fake it with a stack instead. Pseudocode can just recurse and be simpler because of it. —David Eppstein (talk) 04:12, 10 January 2011 (UTC)[reply]

Edges[edit]

The four tree edge types

The different classifications of edges are briefly discussed here, but should perhaps be moved to a new article. Consider:

  • Tree edge: an edge (u, v) where v was discovered by exploring edge (u, v);
  • Back edge: an edge (u, v) where u is connected to one of it's ancestors, v;
  • Forward edge: a non-tree edge (u, v) that connects a vertex u to a descendant v;
  • Cross edge: any other edge where one vertex is not an ancestor of the other.

I think this distinction should be made in the interest of better explaining the ability of detecting cycles in a graph using topological sorting. --Stimpy 20:10, 17 December 2006 (UTC)[reply]

By the way, cycle detection is not the same as topological sorting, and I'm not convinced it belongs in the topological sorting article at all any more than it would belong in articles about other algorithmic problems on DAGs. If it is there, it should be as a brief sentence explaining why the DFS based topological sort algorithm is correct rather than as a separate subsection. —David Eppstein 21:25, 17 December 2006 (UTC)[reply]
I'm sorry, upon reviewing my comment I understand that I didn't make clear what I was suggesting. I emphasize that (un)directional edges should be moved to a separate article and that both DFS article and the topological sorting article point to it, hence bridging the 'gap' in understanding how topological sort can be used – together with the principles of the four edge classifications – to find cycles in a depth-first search tree in time. In none of the articles on the subject is this relationship properly explained, in my opinion. —Stimpy 22:11, 21 December 2006 (UTC)[reply]

reverse postordering[edit]

Either the "reverse postordering" section has an error, or I am confused. Since "reverse postordering is the reverse of a postordering, i.e. a list of the vertices in the opposite order of their last visit", why is D not first in reverse postordering?

TJA —Preceding unsigned comment added by 68.162.149.237 (talk) 17:10, 25 March 2008 (UTC)[reply]

The preordering is ABDC or ACDB. I think the postordering is BDCA or CDBA. The reverse postordering looks to be a DFS to a bottom vertex, and then a DFS to all of the possible vertices. Although, I'm still not sure if "reverse postordering" is something. "Reverse postordering" just seems like a reverse DFS from a root vertex. Somebody please correct me. — Preceding unsigned comment added by 24.21.136.68 (talk) 23:01, 10 February 2013 (UTC)[reply]

A, B, D, F, E?[edit]

Why is the cycle in the depth-first graph "A, B, D, F, E"? I don't understand its assumptions. It seems to me that unless you're going to put some sort of directionality on the edges, you'll get something like A, B, D, B, D...

You can't even get from D to F without revisiting B. Sure, B is a backtrack and you don't usually include those in the search order, but if this search was keeping track of what was previously in the path, it wouldn't get into a cycle in the first place. It would take some sort of memory to go from B to D the first time and B to F the second. This strikes me as an error. rspeer / ɹəədsɹ 06:49, 28 August 2008 (UTC)[reply]

Agh, it gets worse. The section on iterative deepening seems to make the same unknown assumptions.

Memory-less depth first search on an undirected graph is quite silly and nobody ever does it. Whoever wrote the example seems to have assumed that an edge is never crossed twice in opposite directions (but can be in the same direction). I may replace this example with a directed graph to make it make some sort of sense. rspeer / ɹəədsɹ 15:56, 28 August 2008 (UTC)[reply]

No proof?[edit]

I just noticed that there is no proof of the algorithm running time (big O) listed? I thought this was standard for graph algorithms?129.59.172.24 (talk) 22:55, 10 February 2009 (UTC)[reply]

I guess it's omitted here because it's pretty obvious - each node is visited once, and each edge is traversed twice. It's worth spelling out, but I'm not quite sure where to put it or how to formalize it. Dcoetzee 23:15, 10 February 2009 (UTC)[reply]

Worst Case Space Complexity[edit]

Shouldn't DFS have to memorize branching_factor * depth nodes? Consider it visiting a node N expanding two non-goal leaf nodes A and B, after backtracking from A, information about N's remaining children is necessary to avoid running into A again, isn't it? And as this holds for every node visited, this would require to memorize O(branching_factor * search_depth) nodes, doesn't it? —Preceding unsigned comment added by 78.49.51.103 (talk) 17:44, 19 April 2009 (UTC)[reply]

In theoretical computer science, DFS remembers which nodes it's already seen, and therefore takes as much space as BFS. In the AI search literature, which is where the space comparison and the analysis in terms of branching factor comes from, DFS is used without remembering already-seen nodes, and so it needs only enough memory to keep track of the call stack, but in this setting it should be used in conjunction with iterative deepening to control the size of this stack. Probably the differences between TCS and AI usage of DFS are not made sufficiently clear in the article. —David Eppstein (talk) 18:01, 19 April 2009 (UTC)[reply]
No. In the canonical DFS algorithm each level of the "backtracking" stack formally contains two pieces of information: 1) the vertex itself, and 2) the outbound edge, which we traversed when we left the vertex. (Note that in some circumstances the second piece of information can be "derived" indirectly and thus does not have to be physically stored). Every time we backtrack to some previously visited vertex, we use that second piece of information to continue to explore the adjacency of that vertex at the point where we left off. That all there is to it. No need to store all previously visited neighbors. For this reason space complexity of DFS is O(depth). No need to involve the branching factor. 173.11.122.193 (talk) 17:31, 12 March 2014 (UTC)[reply]

Worst Case Space Complexity Contradiction[edit]

From the article:

In these applications it also uses space O(|V| + |E|) in the worst case to 
store the stack of vertices on the current search path as well as the set of already-visited vertices.

From the box:

Worst case space complexity: O(h) where h is the length of the longest simple path in the graph.

These statements are contradictory. I think the box needs to be changed to reflect O(|V| + |E|). —Preceding unsigned comment added by 12.25.230.211 (talk) 21:15, 13 August 2009 (UTC)[reply]

The former is correct for the non-recursive implementation, the latter is correct for the recursive implementation. I'll see if I can make this clear in the article. bbi5291 (talk) 17:04, 15 November 2009 (UTC)[reply]

You didn't read the section just above here in the talk, right? The O(|V|+|E|) style analysis is for applications of DFS in theoretical computer science, where one is given a whole explicit graph in advance, one must search the whole graph, and one uses a data structure of already-visited vertices to avoid repetitions. In particular this data structure means that the space used by the stack is only a small part of the total space so there is no point in analyzing it separately. The O(h) style analysis is for applications in AI, where one typically has an implicit graph, possibly infinite, and one does not eliminate repeated nodes. It has nothing to do with whether the implementation is recursive or not. —David Eppstein (talk) 19:46, 15 November 2009 (UTC)[reply]

Optimality[edit]

I suggest we remove the "optimal" entry from the infobox. Its interpretation is confusing. For an algorithm which solves 1 problem, it is clear what optimal means, it means that the best lower bound for the problem matches the upper bound provided by the algorithm. It doesn't mean anything to say an algorithm is optimal unless you refer to the problem. Since DFS (and BFS) solve so many graph theoretic problems, its not clear what optimal means. Which problem's lower bound is being matched by DFS (or BFS)? (On the other hand, sorting is the problem solved by bubble sort, and it is entirely clear what it means for bubble sort to be non-optimal.) --Robin (talk) 23:48, 16 November 2009 (UTC)[reply]

I agree that it is confusing and that its prominence at the top of the DFS article lends undue weight to the application of DFS to finding paths in graphs, when it has many other uses. I would be happy to see it removed. Even for a single-purpose algorithm, it is not clear whether the optimality refers to the quality of the solution it finds (what I believe to be the intention here) or the time it takes to find it. —David Eppstein (talk) 23:52, 16 November 2009 (UTC)[reply]
Yes, that's true. I had decision problems in mind, where there is no notion of quality of solution. An algorithm is either optimal for the given problem or not. But since the problems solved by DFS/BFS have many correct answers, the quality of the solution is important. So, does anyone object to the entry on optimality being removed from the infobox? --Robin (talk) 00:47, 17 November 2009 (UTC)[reply]
I think the specific intent of this field is "does this algorithm find shortest paths"? E.g. see the "yes (for unweighted graphs)" entry in breadth-first search. —David Eppstein (talk) 04:03, 17 November 2009 (UTC)[reply]
I do find the BFS optimality note weird too. There are excellent shortest path finding algorithms available, why would one use an algorithm that isn't meant to be one? I think the "optimal" entry in both articles will confuse the average reader, who probably came here to use BFS/DFS for some particular application (like checking connectivity or topological sorting). Perhaps optimality can be discussed explicitly in the article somewhere, explaining whether it is optimal for some applications mentioned in the article. Keeping it in the infobox does seem to give it WP:undue weight. (As opposed to the algorithm's time and space usage, which are really important theoretically and practically.) --Robin (talk) 04:24, 17 November 2009 (UTC)[reply]
I guess the question should be, do we rip out the optimality field from the infobox altogether, or do we just fix the infobox template so that omitting that field means it isn't displayed and then leave it blank in this article (and I guess BFS). There are algorithms such as IDA* that really only are for finding paths, but even there the meaning of "optimal: yes" in a box near the top of the page is highly unclear. —David Eppstein (talk) 04:29, 17 November 2009 (UTC)[reply]
If you omit the field, it isn't displayed. However, I went through the list of pages on which this infobox appears, and the situation is quite sad. Apparently, Merge sort is optimal, but Heapsort is "never". Also, Radix sort is "exactly correct". Personally, I'm opposed to calling an O(n log n) sorting algorithm optimal, because the Ω(n log n) lower bound is proved in a different model of computation. I believe that the field "optimal" should just be removed from the template. I'm going to suggest this at the template's talk page Template_talk:Infobox_Algorithm. --Robin (talk) 04:50, 17 November 2009 (UTC)[reply]

Example explosion[edit]

This article needs pseudocode, not a langauage war. Would anyone object to outright removal of all of these language-specific examples? I don't think they add anything. I wouldn't be opposed to moving them to Wikibooks and linking; it is useful to have a reference implementation in your language of choice, but the present state of things obscures the point of this page, which, believe it or not, is to explain depth-first search. —Ben FrantzDale (talk) 12:51, 11 May 2010 (UTC)[reply]

I agree. —David Eppstein (talk) 14:37, 11 May 2010 (UTC)[reply]
I've just seen this article (after reading today's xkcd on the subject). I agree that the many coding examples overbalance the article and are unnecessary. AndrewWTaylor (talk) 15:47, 2 July 2010 (UTC)[reply]
This thread here at the end when another was already going at the top just adds to the fact that no one here is interested in the content, but just in adding more fuel to the language war. good content is good content. it's never too much if it's not interfering with the summary and main topics. — Preceding unsigned comment added by Gcbwiki (talkcontribs) 04:12, 10 January 2011 (UTC)[reply]

"introduction may be too long"?[edit]

How is a two sentence lede section "too long"? Justin W Smith talk/stalk 14:36, 3 July 2010 (UTC)[reply]

I boldly removed the "lead too long" template. Justin W Smith talk/stalk 15:01, 3 July 2010 (UTC)[reply]

Is iterative solution correct?[edit]

Is line 24 of the iterative solution correct? Why is the pop required at this point? — Preceding unsigned comment added by 202.81.18.30 (talk) 06:23, 1 August 2013 (UTC)[reply]

In recursive pseudo-code, back edge detection is wrong[edit]

A back edge found only if w is discovered, not explored. The iterative version gets this right. The recursive has a bug. It will report cycles on many simple DAGs where both a node an its successor have an edge to the same third node. — Preceding unsigned comment added by 69.181.252.29 (talkcontribs)

The code tests if an edge is "explored" but never marks edges as explored. If "explored" means "labelled", then it seems to be ok only for undirected graphs. The non-recursive version (which is not solving the same problem, see the "return" in there) also appears to be for undirected graphs only. Someone with more time on their hands than I do should sort this out properly. As a general principle, even though DFS can classify the edges and this is useful in some applications, the edge classification is not required for the search and the procedure should not be written to rely on it. McKay (talk) 05:15, 10 September 2013 (UTC)[reply]

New iterative pseudocode[edit]

The new iterative pseudocode requires a stack of quadratic size in the worst case since a vertex can be pushed multiple times. This can be useful in some applications, but as a skeleton for the basic DFS process it is wasteful of both time and space. How is this justified? McKay (talk) 05:02, 20 December 2013 (UTC)[reply]

The stack size is linear in the number of edges in the graph, not quadratic, for exactly the same reason that depth first and breadth first take time linear in the number of edges in the graph rather than quadratic in the number of vertices despite having nested loops. Also it's justified because the previous code was (besides being incredibly ugly and convoluted) either wrong or too slow (depending on how you interpreted the "continue" statement). Also it's justified by being the version of the algorithm described in an actual reliable source, unlike the previous code. It is possible to do an iterative DFS that uses only space linear in the number of vertices, but then the stack elements should be iterators (or pointers into the adjacency list) rather than vertices. See e.g. Sedgewick's "Algorithms in Java" or my recent blog post http://11011110.livejournal.com/279880.html. —David Eppstein (talk) 05:08, 20 December 2013 (UTC)[reply]
@David Eppstein:, this requires more elaboration in the article. The way I had learned DFS (from AIMA, primarily, and Skiena indeed describes the same thing but then gives recursive pseudocode) is as "BFS with the queue replaced by a stack". Since this is, as you state in your blog post, a valid DFS algorithm for some situations, I think this page should cover both algorithms and the differences between them.
In any case, I just checked my 2003 edition of AIMA. It gives generic pseudocode for "Tree-Search", then adds:
Depth-first search always expands the deepest node in the current fringe of the search tree. ... This strategy can be implemented by TREE-SEARCH using with a last-in-first-out (LIFO) queue, also known as a stack.
(p. 75.) The "fringe" is what other AI textbooks call the "open set" or agenda.
We can modify the general TREE-SEARCH algorithm to include a data structure called the closed list, which stores every expanded node.
(p. 82.) There's no mention of a need to adapt DFS; the main remark about DFS here is that it no longer takes time linear in the depth of the search tree. QVVERTYVS (hm?) 19:38, 10 May 2015 (UTC)[reply]
Ah, and here's one more thing to add to the confusion: according to the current text in this page, DFS "delays checking whether a vertex has been discovered until the vertex is popped from the stack rather than making this check before pushing the vertex". But this is exactly what the GRAPH-SEARCH (including BFS) algorithm from AIMA is doing, so this presentation is confusing for readers who know DFS/BFS from that textbook. QVVERTYVS (hm?) 19:52, 10 May 2015 (UTC)[reply]
In artificial intelligence usage, depth first and breadth first searches are generally performed on trees, not graphs (the trees of possible search paths in finite or infinite state spaces). In tree-search versions of these algorithms, there is no "visited" list (whenever you expand a node, all of its children are automatically not already visited) and the distinction in question does not exist. The issue comes up when you are searching a graph rather than a tree, using a visited list to avoid processing each vertex more than once. I would guess that is the reason that your AI sources don't notice the problem. —David Eppstein (talk) 19:54, 10 May 2015 (UTC)[reply]
I think you misunderstand my point. AIMA presents algorithms both for trees and for graphs. But what Russell and Norvig do is generalize the "true DFS", rather than BFS, so their BFS is "DFS with a queue instead of a stack" rather than the other way around. So the current statement of the difference in the article is confusing for AIMA readers, since it presumes a particular version of BFS. QVVERTYVS (hm?) 20:10, 10 May 2015 (UTC)[reply]
They use a version of breadth-first search in which the same node can be on the queue multiple times, and is only checked for having been visited when it is pulled from the queue? That seems unusual to me. It is certainly not the version of BFS presented in our breadth-first search article. Changing the queue for a stack in the pseudocode presented there would not give DFS, and I think that confusion is more likely than confusion among readers who happen to have read some specific source. —David Eppstein (talk) 20:18, 10 May 2015 (UTC)[reply]
That is exactly what they do (and why a BFS following their implementation is a memory hog). But maybe you're right. QVVERTYVS (hm?) 20:38, 10 May 2015 (UTC)[reply]
It is not quadratic, it is linear in the number of edges. But this is still very wrong. The "non-recursive implementation of DFS" given in this Wiki entry is "fake DFS" or "pseudo-DFS". It is not a true DFS. The posted pseudo-DFS algorithm produces the DFS-like vertex discovery sequence, but that where its similarity with DFS ends.
In canonical DFS algorithm stack depth is limited to the length of the longest DFS path in the graph. If you look at the recursive implementation of DFS given in the Wiki entry (which is correct), you will see that it satisfies this property. The non-recursive implementation does not satisfy it. The simplest example that would illustrate the issue would be a N-star-graph with DFS beginning at the central vertex. The proper DFS algorithm will traverse this graph never exceeding stack depth of 1. The algorithm given in this entry will instantly reach stack depth of N, which is already a direct violation of fundamental properties of DFS.
If you want to create a proper non-recursive DFS implementation, one way to do it would be to directly translate the recursive implementation into the non-recursive one. For that at each level of the stack you have to memorize the current vertex and the current outbound edge. At each level of recursion you have to continue iterating through the outbound edges stating from the previously memorized one. That way you will obtain the valid non-recursive DFS implementation.
What we have now is an abomination.

173.11.122.193 (talk) 17:22, 12 March 2014 (UTC)[reply]

Uh, why is this using intentionally different algorithms for recursive vs iterative pseudocode examples? It's just adding a different dimension of difference in addition to whether it's explicitly recursive or not. It is trivial to convert either into the other: to turn the recursive one into the iterative one, put the visited check at the very beginning (around everything), and remove the per-edge visited check. To go the other way, move the visited check in the iterative algorithm away from the popped node and into the edge loop. There is absolutely no reason for things to be different just because you're using an explicit stack instead of an implicit one. Heck, you could make the stack explicit in the recursive example by replacing "recursively call DFS(G,w)" with "stack.push(w); tailcall(DFS)" (or "goto top" if you prefer). What's there now is complicating things by changing the algorithm at the same time as it's changing where the stack is. I would change it, but then I'd have to look up a reference. Sphink (talk) 20:29, 18 November 2015 (UTC)[reply]
I have deleted the broken iterative pseudo-code which violates the worst-case space bound of O(|V|) given by the article. This algorithm was not acceptable. To give an analogy, the presence of that algorithm in this page would be just as bad as an O(2^N) sorting algorithm presented as a variant of O(N log N) heapsort. If we want there to be an iterative DFS pseudocode, it should be one which respects the same worst-case space bound as the recursive version. Rbarreira (talk) 18:20, 24 June 2017 (UTC)[reply]
So find a source for such a version and use that. The pseudo-code is not broken (it generates a correct DFS traversal in linear time), it is properly sourced to a textbook, and its additional space usage is clearly described. —David Eppstein (talk) 18:27, 24 June 2017 (UTC)[reply]
The space usage violates the very property claimed by the article. It is not a correct version of DFS. Just because there might be a textbook with a broken claim, does not mean we should accept an algorithm that contradicts the very article it's presented in. Rbarreira (talk) 18:32, 24 June 2017 (UTC)[reply]
There is no broken claim. The space usage of this variant is stated clearly and correctly as part of the description of the variant. Just because it is less space-efficient than the usual recursive variant does not make it broken, and does not make it an invalid variant of the algorithm. —David Eppstein (talk) 18:40, 24 June 2017 (UTC)[reply]
So, can anyone add clarification why exactly pushing to stack and then checking if node is visited is so important here that it is explicitly stated as a difference? Doing so and not the other way around looks like a waste of memory for me. — Preceding unsigned comment added by 195.64.208.216 (talk) 19:43, 1 April 2019 (UTC)[reply]

Needs citation?[edit]

I noticed that there is a phrase in Vertex ordering - it is natural to consider this code in the order A B C D or A C B D, but not natural to use the order A B D C or A C D B. Is this a claim? I mean how does this sentence add anylokirockz 13:09, 17 May 2014 (UTC) value to the reader? Lokesh Ravindranathan 13:09, 17 May 2014 (UTC)[reply]

ABCD and ACBD are orderings in which you could actually write that code as structured code (ACBD comes from switching which clause is the if and which is the else). The other two orderings would be possible for spaghetti code with gotos, but not as structured code, because the D clause comes between the if and the else. This part of the article is saying that reverse-postorder of the control flow graph matches the ordering that you would normally write the code. The claim that reverse-postorder matches the ordering of structured code probably does need a citation but this specific example doesn't. —David Eppstein (talk) 18:50, 17 February 2015 (UTC)[reply]

Add abbreviation definition?[edit]

Is it possible to add the definitions for |V| and |E| in the Properties section; because of the 'rule': explain abbreviations where they appear? Corresponding paragraph: In theoretical computer science, DFS is typically used to traverse an entire graph, and takes time Θ(|V| + |E|), [...] [1] Averrhoa (talk) 09:30, 16 October 2015 (UTC)[reply]

References

Non-recursive pseudocode from Kleinberg and Tardos?[edit]

The non-recursive pseudocode is unnecessary complicated and it is not from Kleinberg and Tardos as opposed to what is claimed. "and not in stack" is not an operation you can perform on a stack and it is not even needed here. The innermost "if w is not labeled as discovered" is unnecessary too, since this condition is checked anyway in the while loop. At this stage you can simply push the node to the stack. Below is the exact pseudocode from Kleinberg and Tardos, Algorithm Design, page 93. I don't know how to format things here so please someone repair this.

 DFS (s) :
 Initialize S to be a stack with one element s
 While S is not empty
   Take a node u from S
   If Explored[u] = false then
     Set Explored[u] = true
     For each edge (u, v) incident to u
       Add v to the stack S
     Endfor
   Endif
 Endwhile

94.222.108.200 (talk) 17:12, 10 August 2016 (UTC)[reply]

I just found out that the recent edits by 47.19.74.2 on 7 July 2016 and by 2601:246:100:58b2:48b7:c3e6:2773:8af4 on 20 July 2016‎ are responsible for the blunders in the pseudocode. The code should be reverted to the version before that. 94.222.108.200 (talk) 17:35, 10 August 2016 (UTC)[reply]

 Done Thanks for catching this. I believe the "not in stack" test turned this into something different from DFS; see http://11011110.livejournal.com/279880.html on some pitfalls in this area. I have restored the simpler pseudocode. —David Eppstein (talk) 17:47, 10 August 2016 (UTC)[reply]

Lexicographic ?[edit]

There is a big problem with recent edits. They mention lexicographic DFS. It is not defined anywhere on this page, and there is no link to any wikipedia page explaining what LexDFS is. There is a Lexicographic BFS wikipedia article, but it seems that there is no lexBFS article. Therefore, either the paragraph complexity should be removed, or the explanation should be largely expanded. In particular, the sentence «the lexicographic DFS order of a rooted graph» is misleading, I think. Because it means that, given a rooted graph, there is a single lexicographic DFS. This statement would be false for LexBFS, I doubt it would be true for LexDFS. Sadly, I'm not graph savy enough to do the correction myselfArthur MILCHIOR (talk) —Preceding undated comment added 10:21, 27 June 2017 (UTC)[reply]

It is defined immediately after it is used, to mean the DFS you get from the standard recursive algorithm. Do you have a better word for that meaning than "lexicographic"? —David Eppstein (talk) 15:40, 27 June 2017 (UTC)[reply]
With this explanation, I begin to understand. But the sentence seems really strangely phrased to me. May be because english is not my mother tong, and automata theory is not my subject of research. I'll try the following rephrasing, I don't know whether the sentence is more clear, or even mathematically correct.

The computational complexity of DFS was investigated by John Reif. More precisely, given a graph , let be the ordering computed by the standard recursive DFS algorithm. He considered the complexity of computing given . A decision version of the problem (testing whether some vertex u occurs before some vertex v in ) is P-complete[9]. It means that it is "a nightmare for parallel processing".[10]:189

Note that is a precise DFS. However, many other DFS exists. They can be obtained, using the standard recursive algoirthm on , obtained from by reordering the outgoing edges of each vertex. An arbitrary DFS can be computed by a randomized parallel algorithm in the complexity class RNC.[11] As of 1997, it remained unknown whether a depth-first traversal could be constructed by a deterministic parallel algorithm, in the complexity class NC.[12] — Preceding unsigned comment added by Arthur MILCHIOR (talkcontribs) 23:38, 27 June 2017 (UTC)[reply]

Most of this looks ok, but I don't like your sentence "An arbitrary DFS can be computed". It is too ambiguous about who makes the arbitrary choice of which traversal ordering is to be computed: is it part of the input to the algorithm (no) or something the algorithm itself chooses (yes). Also, using "DFS" to refer to the traversal ordering, rather than to the search algorithm, is perverse, and in any case we should avoid unnecessary acronyms because they make our articles more jargony and by doing that they violate WP:TECHNICAL. Finally, I checked the relevant literature and they do indeed consistently use "lexicographic" to mean the traversal that you get by the standard algorithm. It is the traversal whose edge sequence is lexicographically first, based on a given ordering of the edges of the graph. There are also some additional papers that can be cited here; for instance a deterministic parallel lexicographic DFS is known for some special classes of graphs (including directed acyclic graphs). —David Eppstein (talk) 00:49, 28 June 2017 (UTC)[reply]
In case it was not clear, I am totally ok with the word lexicographic. My point was that, if one use this word in wikipedia, I think that one should state a precise and clear definition of this word. By this, I mean, at least a subsection named «lexicographic ordering». Or «A lexicographic Depth-First Search ordering is : .... ». I didn't see anything like that, and, when I tried to find the definition, I sincerely had trouble finding it. You told there were a definition below. I assure you that it was not clear to me that it was a definition. Therefore, I fear some other reader will have the same trouble. I don't know what is the «relevant literature», so I think I'll let you do the edit to add the definition in wp.
I do approve all of your remarks concerning my paragraph. Is it ok for you I copy this paragraph from Discussion to Page, without using DFS but Depth-first search (or Depth-first search ordering if required (my mistake)), replacing the word «arbitrary» by «A Depth-First Search ordering, not necessarily , can be computed by a randomized parallel algorithm in the complexity class RNC.» — Preceding unsigned comment added by Arthur MILCHIOR (talkcontribs) 11:01, 28 June 2017 (UTC)[reply]
Go ahead, we can always copyedit it more later. —David Eppstein (talk) 15:40, 28 June 2017 (UTC)[reply]

SLD resolution[edit]

@David Eppstein: I noticed that you removed one of the examples that I added here because it was "too specialized." Would this example be suitable for inclusion elsewhere in this article? Jarble (talk) 00:57, 11 July 2017 (UTC)[reply]

Maybe, even in the same list of applications. But it's a bad choice for the top of the list, because we should start with the simple and generic applications (currently at the top) and then add more specialized ones. —David Eppstein (talk) 01:03, 11 July 2017 (UTC)[reply]
@David Eppstein: I was misled by Wikipedia's description of backtracking algorithms, which incorrectly conflates backtracking with depth-first search. Jarble (talk) 17:54, 12 July 2018 (UTC)[reply]

Iterative version: why pushing visited nodes?[edit]

It's unclear to me why the iterative algorithm shown here has this difference with BFS, noted after the algorithm, that the test for having been visited is done after pop rather than before push. This does not seem to be intrinsically needed by the algorithm. DFS and BFS should only differ by the underlying structure (stack vs queue). --24.148.56.233 (talk) 01:03, 26 October 2019 (UTC)[reply]

As already stated above, please see https://11011110.github.io/blog/2013/12/17/stack-based-graph-traversal.htmlDavid Eppstein (talk) 02:10, 26 October 2019 (UTC)[reply]
Sorry, not sure how I missed that the conversation was already going on elsewhere. I'd add my voice to the people that argue that the version on Wikipedia should have the property that dfs and bfs only differ by the queue vs stack property (using what you call dfs2 on 222). --24.148.56.233 (talk) 02:56, 27 October 2019 (UTC)[reply]