187
Views
0
CrossRef citations to date
0
Altmetric
Original Article

Results on the Wiener profileFootnote

&
Pages 53-62 | Received 23 Jan 2017, Accepted 02 Jan 2018, Published online: 10 Jun 2020

1 Introduction

Let G=(V,E) be a simple connected graph (without loops and multiple edges). The distance d(u,v) of two vertices u,vV is the number of edges in a shortest path between u and v. The Wiener index w(G) of G is the sum of these distances for all distinct pairs d(u,v). This concept was introduced by Harry Wiener, who showed that this graph parameter is closely correlated with some chemical parameters, for instance with the boiling points of alkane molecules [Citation1]. As an example, the Wiener indices of a large classes of fullerenes are determined in [Citation2]. The book [Citation3] contains many analogues and generalizations of the Wiener index, all of them have relevance in chemistry.

Let us introduce a new vector-parameter, the Wiener profile as the n1-dimensional vector (f1,f2,,fn1) where fk is the number of pairs of distinct vertices (u,v) with distance d(u,v)=k. It is easy to see that the Wiener index is equal to w(G)=i=1n1ifi.The contribution of a pair (u,v) of vertices (atoms in a molecule) to the Wiener index is i if their distance is d(u,v)=i. However, if we want to have a parameter which indicates a certain chemical property then the contribution of a pair (u,v) could be different from their distance. Suppose that it is some αi in case when d(u,v)=i. Here we supposed that the contribution of the pair (u,v) to the property depends only on their distance. But the choice of the numbers αi might depend on the chemical property in question. Let α=(α1,α2,,αn1) be the vector of these coefficients. Then the weighed Wiener index with these coefficients is wα=i=1n1αifi.The choice α=(1,2,,n1) gives back the Wiener index.

We will study the structure of the set of Wiener profiles of graphs on n vertices. Suppose e.g. that we want to find the molecule containing n atoms and having a certain chemical property the most or least, then we have to maximize or minimize wα under the condition that (f1,f2,,fn1) is the Wiener profile of a graph of n vertices (α is fixed!). From the chemical point of view it makes sense only to consider connected graphs, but technically it is easier to handle the set of all graphs on n vertices. If a graph is not connected disregard the pairs in different components. Take all Wiener profiles of the graphs on at most n vertices (in other words the subgraphs of the n-vertex complete graph Kn.) Denote this set by Wn. It is a set of points with integer coordinates in the n1-dimensional Euclidean space.

Let us note that profile vectors were also studied in Extremal Set Theory (see e.g. [Citation4]). There is, however a huge difference between the structures of the sets of profile vectors there and here. In Extremal Set Theory, if the coordinates of a profile vectors are decreased (remaining in the non-negative range) then the new vector is also a profile vector. This is not true here at all, as the n=3 shows. There are 4 non-isomorphic graphs on three vertices: the complete graph K3, the path of two edges, a single edge and the empty graph. Their profiles in this order are (3,3),(2,1),(1,0),(0,0). That is, W3={(3,3),(2,1),(1,0),(0,0)}. One can see that (3,2) is not a profile though (3,3) is.

Our main task is to give necessary and/or sufficient conditions on vectors to make them Wiener profiles. There is one trivial condition. Since fi is counting the number of distances i, their sum counts the total number of distances. i=1n1fi=n2.But there are more complex relations among the coordinates fi. We will introduce them below. However, before that, we suggest a new invariant of graphs that intuitively also has significance in chemistry. In the Wiener index and in the Wiener profile the pairs of vertices with distance k are considered. The distance is defined by a shortest path between the two vertices. But there might be more paths of the same (shortest) length between the vertices. The molecule is probably more rigid if the vertices are connected with more paths. This motivation makes us to introduce the concept of the path profile. Let pk be the number of paths of length k in the graph G containing n vertices (1k<n). The path profile is the vector (p1,p2,,pn1). The path index is then π(G)=i=1n1ipi.

Here we will call the attention on some more complex necessary conditions that pk’s and fk’s must satisfy. Suppose that pk is known, fixed. One can intuitively feel that pk1 cannot be too small. Indeed, the graph contains pk paths of length k, these paths contain many paths of length k1. This fact must give a lower bound on the number pk1. Let us use a somewhat different approach, considering the set E of edges. The pk paths are its k-element subsets. Deleting the first or last edge from such a path, a path of length k1 is obtained. What is the minimum number of these k1-element subsets? The literature knows a very closely related problem/results. Let A be a family of k-element sets. The shadow σ(A) is a family of those k1-element sets which are obtained from the members of A deleting one (arbitrary) element. The Shadow Theorem [Citation5,Citation6] determines the exact minimum of |σ(A)| for given k and |A| (where |S| denotes the size of the set S). We will study an analogous problem here.

The Path Shadow Problem. Knowing the number pk of paths of length k in a graph, determine (or estimate) the minimum number of subpaths of length k1 (or in general of length ). There are substantial differences between this problem and the traditional Shadow Problem. Here the k-element subsets of edges are not necessarily paths of length k, in contrast to the case of the Shadow Problem. On the other hand, in the new problem we cannot delete edges arbitrarily from a path obtaining a shorter path.

The Distance Shadow Problem is formally very similar. Given fk, the number of pairs of distinct vertices with distance k, determine the minimum number of distances k1 in a graph. Our present paper is devoted to these two problems.

2 Minimization of path and distance shadows

Given a graph G let fk=fk(G) be the number of pairs of distinct vertices (u,v) with distance d(u,v)=k. Our goal in this section is to find the minimum of f(G) for all graphs containing fk pairs with distance k where k>. Denote the minimum by M(fk,k,). If n, the number of vertices is known then we can define the minimum M(fk,n,k,). These quantities are the minimum sizes of the distance shadow. The following inequality is obvious. (1) M(fk,k,)M(fk,n,k,).(1)

But there is an another variant of this problem. The distance d(u,v) is defined by a (shortest) path. In our modified problem we consider all paths of length k, not only the ones which define the distance. Let N(fk,k,) denote the minimum number of paths of length in a graph containing exactly fk paths of length k. Similarly, let N(fk,n,k,) denote the same under the condition that the number of vertices of the graph is fixed: n. The following inequality is obvious, again. (2) N(fk,k,)N(fk,n,k,).(2) These are the minimum sizes of the path shadows.

2.1 The case =1, paths

The results in this subsection are taken from the literature. We list them for the sake of completeness.

Here the number of distances 1 and the number of paths of length 1, that is the number of edges should be minimized in both cases. It is easier to handle the following inverse problem. The number of edges in the graph is given, determine the maximum number of distances k or paths of length k in the graph, either fixing the number of vertices or not.

Let the inverse of N(fk,k,1) be R(e,k). This is the maximum number of paths of length k in a graph containing e edges. Similarly, let R(n,e,k) denote the same under the condition that not only the number of edges, but also the number of vertices is given: e and n, respectively. It is easy to determine R(e,2).

Proposition 1

R(e,2)=e2.

Proof

The star (one “central” vertex contained in all e edges) gives the above value, proving R(e,2)e2. On the other hand a path of length two consists of two edges, therefore one cannot have more of them than e2.  □

It is more difficult to determine R(n,e,2). Paper [Citation7] contains an almost complete solution. In order to be able to formulate its main statement some definitions are needed. A quasi-clique consists of a clique (complete graph) and an additional vertex adjacent to a subset of the vertices of the clique. (If there is no additional vertex then the quasi-clique is simply a clique.) A quasi-star is the complement of a quasi-clique.

Theorem 1

[Citation7]

R(n,e,2) is equal to the number of paths of length two either in a quasi-star or in a quasi-clique with e edges. Moreover the extremal graph is a quasi-star if e12n2n2 and a quasi-clique if e12n2+n2.

One can see from this special case that the determination of R(e,k) is easier than that of R(n,e,k). However in the case of k=3 the difference is not so much.

Theorem 2

[Citation8]

If 10a2e<a+12 holds then R(e,3)2e(ea)a2a with equality for the complete graph on a vertices.

Here we do not have the exact value for all e, but this theorem gives a very good asymptotic estimate for R(e,3). Moreover, the best (asymptotic) construction does not depend on n. Therefore the asymptotic solution (that is sharp for infinitely many e) gives the asymptotically sharp solution for R(n,e,3), too.

Based on these results one can guess that the odd and even k’s behave differently.

Theorem 3

[Citation9], see also [Citation8]

If k is odd then both R(n,e,k) and R(e,k) are asymptotically equal to 2k12ek+12 and the asymptotically sharp construction is a complete graph.

Theorem 4

[Citation8]

If k is even then R(e,k) is asymptotically equal to ckek2+1, where ck is a constant depending only on k.

If k=4, then R(e,4) is basically known.

Theorem 5

[Citation10]

If e is large enough then R(e,4)=e383e24+eifeiseven,e387e28+15e898ifeisodd.The extremal construction is a complete bipartite graph with sizes e2 and 2.

Very little is known on R(n,e,k), but R(n,e,4) is asymptotically determined.

Theorem 6

[Citation11]

R(n,e,4) is asymptotically equal to the number of paths either in a quasi-star or in a quasi-click.

2.2 The case =1, distances

We consider the inverse again. The inverse of M(fk,k,1) is S(e,k): this the maximum number of distances k in a graph with e edges, while S(n,e,k) is the maximum number of distances k in a graph with n vertices and e edges. Of course (3) S(e,k)R(e,k) and S(n,e,k)R(n,e,k)(3) holds. The inequalities are usually very sharp, since, e.g. the number of distances is at most quadratic in n, while the number of paths can be much larger.

Proposition 2

S(e,2)=e2.

Proof

The statement follows from Proposition 1, (Equation3) and the fact that the star with e rays contains this many distances 2.  □

Proposition 3

S(n,e,2)=e2 if en1,n2e if ne.

Proof

If en1 then one cannot have more than e2 paths of length 2, therefore this is an upper bound for the number of distances, as well. The star of e edges gives equality. On the other hand, if ne then the distance between the endpoints of an edge is one, cannot be two. Hence the total number of distances two cannot be more than n2e. The star with n1 rays and e(n1) other edges gives the equality.  □

Theorem 7

For fixed k and large e S(e,k) is asymptotically equal to e2.

Proof

Let us see that the number of finite distances in a graph with e edges cannot exceed e+12. If the graph is connected then the number of vertices is at most e+1, therefore the number of distances is at most e+12. Suppose now that the numbers of edges in the r components are e1,e2,,er where e1+e2++er=e. The i’s component has at most ei+1 vertices therefore at most ei+12 edges. The total number edges is at most i=1rei+12.This is really at most e+12 as the following lines show: i=1rei+12=12i=1rei2+12i=1rei12i=1rei2+12i=1rei= i=1rei+12=e+12.

Since e+12 and e2 are asymptotically equal, the latter one is really an asymptotical upper bound.

Below we will construct asymptotically optimal graphs, distinguishing cases according to k.

Suppose first that k is even. Let T(t,k2) be a complete t-ary tree of depth k. This is a tree with a root of degree t, all other vertices have degrees t+1 and 1. The latter one are the leaves. The distance of the root and a leaf is k2. The number of edges is (4) e=t+t2++tk2=ttk21t1tk2(4) where it is supposed in the asymptotical calculation that k is fixed and t is large. Denote the vertices of distance 1 from the root by v1,,vt, while the set of leaves connected to the root through vi is denoted by Li. It is easy to see that the distance of the vertices aLi and bLj(ij) in the tree T(t,k2) is exactly k. Of course |Li|=tk21. Hence the number of pairs of vertices with distance k is at least (in fact exactly) t2(tk21)2tk2.Using (Equation4) we really have that the number of distances k is asymptotically e22e2 if k is fixed and t is large. This proves the statement for even k.

Suppose now that k>3 is odd. Define a graph G(t,k) in the following way. It will consist of a complete graph Kt and t copies of the rooted tree T(t,(k1)2), one copy attached to every vertex of Kt. The set of leaves of the copy attached to the ith vertex is denoted by Li. The number of edges of G(t,k) is (5) t2+t(t++t(k1)2)t(k+1)2.(5) The distance of the vertices aLi and bLj(ij) in G(t,k) is exactly k. Of course |Li|=t(k1)2. Hence the number of pairs of vertices with distance k is at least (in fact exactly) t2(t(k1)2)2tk+12.Using (Equation5) we obtain that the number of distances is e22e2, as stated.

In the case of k=3 the construction above should be slightly modified. Here we attach a star with t2 edges to each vertex of the complete graph Kt. The endpoints of different stars have distance 3. The number of edges is (6) e=t2+tt2t3.(6) The number of pairs with distance 3 is t2t2t2t62e22,where (Equation6) was used.

Our constructions show the statement for an infinite sequence of values of e, namely for the values of form (Equation(4)Equation(6)) in the respective cases. This would be sufficient if we knew that S(e,k) is a monotone function of e. However it is absolutely non-trivial that adding an edge to a graph increases the distances k. Here, however only the monotonicity of our construction is needed.

Suppose that k is even and t+t2++tk2<e<(t+1)+(t+1)2++(t+1)k2.Add e(t+t2++tk2)edges to T(t,k2) without forming a cycle. No distance k can be destroyed in this way in T(t,k2). Therefore the number of distances k in this new graph is at least as much as in T(t,k2). This proves that the number of distances k can asymptotically be tk2 when e is asymptotically at most (t+1)k2tk2.

The other cases of k can be similarly settled.  □

If k is very close to e then there is a hope to get the exact value of S(e,k).

Conjecture 1

If 0 is fixed, e is large enough then S(e,e)=22.

The construction giving this value is two stars with 2 and 2 edges connected with a path of e edges. One easily prove the conjecture for =0,1,2,3.

2.3 The case >1, paths

Here we consider the inverse problem, again. Let U(p,,k)(<k) be the largest number of paths of length k in a graph containing p paths of length . The most important case is =k1. Having a good upper estimate on U(p,k1,k) (and U(p,k2,k1)) we get one for U(p,k2,k), and so on. Make our notation shorter: U(p,k)=U(p,k1,k).

Proposition 4

U(p,k) is asymptotically lower bounded by 21kpk+1k.

Proof

The complete graph Kn serves as a construction. The number of paths of length k1 is p=n(n1)(nk+1)2 (choosing an ordered sequence of vertices of length k, we have to divide the product by 2 because every path is counted starting from both ends). Hence pnk2 that is n(2p)1k. The number of paths of length k is n(n1)(nk)2nk+12(2p)k+1k2=21kpk+1k.  □

Observe that this upper bound is too weak if k=2. It gives 2p32 while we know by Proposition 1 that U(p,2)=R(e,2)=e2=p2. However we believe that the bound is asymptotically sharp starting from k=3.

Conjecture 2

U(p,3)213p43.

Theorem 8

Conjecture 1 is true for regular graphs.

Proof

Let n denote the number of vertices of the regular graph of degree d. The number of paths of length 2 with middle vertex at a given vertex is d2. The total number of paths of length 2 is (7) p=nd2.(7) The number of paths of length 3 with a fixed edge in the middle is (d1)2, since we have (d1) ways to continue the path at each end of the fixed edge. The total number of paths of length 3 is obtained by multiplying this with the number of edges. The number of edges is nd2. Hence the number of paths of length 3 is (8) nd2(d1)2.(8) We know by Proposition 4 that 213p43 is an asymptotic lower bound. Here we will prove that it is also an upper bound for (Equation8) that is for the number of paths of length 3: (9) nd2(d1)2213p43.(9) Use (Equation7) on the right hand side. nd2(d1)2213p43=213nd243.Writing out the details this becomes nd(d1)2n43d43(d1)43.Divide both sides by nd(d1)43. We have arrived to (d1)23n13d13.This is true, since the degree cannot exceed the number of vertices. (Equation9) and the theorem are proved.  □

2.4 The case >1, distances

Let T(p,,k)(<k) be the largest number of distances of length k in a graph containing p distances of length . The most important case is =k1. Having a good upper estimate on T(p,k1,k) (and T(p,k2,k1)) we get one for T(p,k2,k), and so on. Make our notation shorter: T(p,k)=T(p,k1,k).

Of course T(p,2)=S(p,2) and Proposition 2 gives the exact solution. The number of distances 2 is a quadratic function of the number of distances 1 (edges). We believe that this cannot happen for larger k’s. Actually, it is difficult to construct a graph in which the number of distances 3 is much larger than the number of distances 2.

Theorem 9

136(1o(1))plogpT(p,3)where log means logarithm of basis 2.

The proof will be divided into lemmas.

Let the vertices of a graph Gd be the sequences of length d formed from the elements 3,2,1,0,1,2,3. Two vertices are adjacent if the corresponding sequences are equal in all but one place where the difference of the values is 1 or 1. This is actually a 7×7××7 “grid” containing 7d vertices. Let n(d,t) denote the number of vertices having distance t from the origin (0,0,,0). The case t=1 is trivial. (10) n(d,1)=2d.(10)

Lemma 1

n(d,2)=2d2.

Proof

Let H denote the set of vertices of distance 2 from the origin in G(d+1). Divide H into subsets according to their first coordinates: H(2),H(1),H(0),H(1),H(2). Then (11) |H|=|H(2)|+|H(1)|+|H(0)|+|H(1)|+|H(2)|.(11)

The elements of H(0) have distance 2 in the rest, that is in Gd. Hence we have (12) |H(0)|=n(d,2).(12) The elements of H(1) have a value 1 in the first coordinate. Therefore their rest have distance 1 from the origin. Hence we have (13) |H(1)|=n(d,1).(13) The same holds for H(1): (14) |H(1)|=n(d,1).(14) Finally, if the first coordinate is 2 or 2 then the other d coordinates must be all 0. Hence we have (15) |H(2)|=|H(2)|=1.(15) Eqs. (10)–(15) give (16) n(d+1,2)=n(d,2)+4d+2.(16) Now use induction on d to prove the statement of the lemma. For d=1, it is trivial that there are two vertices of distance 2 from the point 0. Suppose now that the statement holds for d and prove it for d+1. By (Equation16) and the inductional hypothesis we can write n(d+1,2)=2d2+4d+2=2(d+1)2.

Lemma 2

n(d,3)=43d3+23d.

Proof

The logic of the previous proof will be used again. Let H denote the set of vertices of distance 3 from the origin in G(d+1). Divide H into subsets according to their first coordinates: H(3),H(2),H(1),H(0),H(1),H(2),H(3). Then (17) |H|=|H(3)|+|H(2)|+|H(1)|+|H(0)|+|H(1)|+|H(2)|+|H(3)|.(17)

The elements of H(0) have distance 3 in the rest, that is in Gd. Hence we have (18) |H(0)|=n(d,3).(18) The elements of H(1) have a value 1 in the first coordinate. Therefore their rest have distance 2 from the origin. Hence we have (19) |H(1)|=n(d,2).(19) The same holds for H(1): (20) |H(1)|=n(d,2).(20) If the first coordinate of a vertex is 2 then the rest must have distance 1 from the origin. This proves (21) |H(2)|=n(d,1)(21) and (22) |H(2)|=n(d,1).(22) Finally, if the first coordinate is 3 or 3 then the other d coordinates must be all 0. Hence we have (23) |H(3)|=|H(3)|=1.(23) Eqs. (17)–(23) give (24) n(d+1,3)=n(d,3)+2n(d,2)+4d+2(24) and, using the statement of Lemma 1, (25) n(d+1,3)=n(d,3)+4d2+4d+2.(25) Now use induction on d to prove the statement of the lemma. For d=1, it is trivial that there are two vertices of distance 3 from the point 0. Suppose now that the statement holds for d and prove it for d+1. By (Equation25) and the inductional hypothesis we can write n(d+1,3)=43d3+23d+4d2+4d+2=43(d+1)3+23(d+1).

Proof of Theorem 9

Modify the graph Gd at the beginning of our proof. Let the vertices of the graph Z8d be the sequences of length d formed from the elements 3,2,1,0,1,2,3,4 and consider these integers modulo 8. Two vertices are adjacent if the corresponding sequences are equal in all but one place where the difference of the values is 1 or 1 (mod 8). This is actually a 8×8××8 “cyclic grid” containing 8d vertices. The 3-neighborhood of a vertex v in a graph is the subgraph spanned by the set of vertices of distance at most 3 from v. Observe that the 3-neighborhoods of the origins in Gd and Z8d are isomorphic. Moreover the 3-neighborhoods of two distinct vertices in Z8d are also isomorphic. Hence it follows that the number of vertices with distance 3 (resp. 2) from a vertex of Z8d is the same as the number of vertices with distance 3 (resp. 2) from the origin of Gd. Summarizing, the number of vertices with distance 3 (resp. 2) from a vertex of Z8d is n(d,3) (resp. n(d,2)). Therefore the total number of distances 2 in Z8d p=128dn(d,2).By Lemma 1 this is equal to (26) p=128d2d2=8dd2.(26) Similarly, the total number of distances 3 in Z8d 128dn(d,3)and by Lemma 2 this is equal to (27) 8d23d3+13d.(27) Here (Equation26) and (Equation27) lead to (28) 8d23d3+13dT(8dd2,3).(28) Eq. (Equation26) gives us plogp=8dd2(3d+2logd).Compare this with the left hand side of (Equation28). One can see that (29) 29(1o(1))plogp=29(1o(1))8dd2(3d+2logd)8d23d3+13d.(29) The inequalities (Equation28) and (Equation29) prove the statement of the theorem in a stronger form, but only for some special values of p, which form an exponential sequence.

Let us prove now the weaker inequality for the intermediate values of p. Suppose (30) p1=8dd2<p<8d+1(d+1)2=p2.(30) Add p8dd2 paths of length 2 to Z8d in such a way that they do not form any cycle neither with Z8d nor with each other. The so obtained graph G contains p pairs with distance 2 and at least (Equation27) pairs with distance 3.

Observe that (31) 18(1o(1))p2logp2=p1logp1.(31) Formulas (Equation30) and (Equation31) imply 136(1o(1))plogp29(1o(1))p1logp1.Combining this with (Equation29) the desired inequality is obtained.  □

Open Problem 1

Find a non-trivial upper bound on T(p,3).

Notes

Peer review under responsibility of Kalasalingam University.

References

  • Wiener H. Structural determination of paraffin boiling points J. Amer. Chem. Soc. 1 69 1947 17 20 10.1021/ja01193a005
  • Hua Hongbo Faghani Morteza Ashrafi Ali Reza The Wiener and Wiener polarity indices of a class of fullerenes with exactly 12n carbon atoms MATCH Commun. Math. Comput. Chem. 71 2014 361 372
  • Janežić Dušanka Miličević Ante Nikolić Sonja Trinajstić Nenad Graph-Theoretical Matrices in Chemistry 2015 CRC Press
  • Erdős Peter L. Frankl P. Katona G.O.H. Extremal hypergraph problems and convex hulls Combinatorica 5 1985 11 26
  • Katona G. A theorem on finte sets Theory of Graphs, Proc. Coll. held at Tihany 1966 Akadémiai Kiadó 187 207
  • Kruskal J.B. The number of simplices in a complex Mathematical Optimization Techniques 1963 University of Calif. Press Berkeley and Los Angeles 251 278
  • Ahlswede R. Katona G.O.H. Graphs with maximal number of adjacent pairs of edges Acta Math. Acad. Sci. Hungar. 32 1978 097 120
  • Bollobás B. Sarkar A. Paths in graphs Studia Sci. Math. Hungar. 38 2001 115 137
  • Alon N. On the number of subgraphs of prescribed type of graphs with a given number of edges Israel J. Math. 38 1981 116 130
  • Bollobás B. Sarkar A. Paths of length four Discrete Math. 265 2003 357 363
  • Nagy Dániel T. On the number of 4-edge paths in graphs with given edge dencity Combin. Probab. Comput. 26 3 2017 431 444