Discussion:
SSSD starts, then stops
Christian Tardif
2015-01-11 23:11:32 UTC
Hi,

I have SSSD installed and set up as I have done numerous times before. But this time,
it refuses to work correctly.

The domain is a Samba 4.1.14 domain with rfc2307 enabled (and users
provisioned accordingly). SSSD has been set up as follows:

========================================================
[domain/THEDOMAIN]
id_provider = ldap
auth_provider = ldap
chpass_provider = ldap
access_provider = simple
ldap_uri = ldap://THESERVERDOMAIN/
ldap_search_base = dc=THEDOMAIN,dc=THESUFFIX
ldap_default_bind_dn = cn=ldap,cn=users,dc=THEDOMAIN,dc=THESUFFIX
ldap_default_authtok = *********************
ldap_default_authtok_type = password
ldap_user_object_class = user
ldap_user_search_base = cn=users,dc=THEDOMAIN,dc=THESUFFIX
ldap_group_object_class = group
ldap_group_search_base = cn=users,dc=THEDOMAIN,dc=THESUFFIX
ldap_id_mapping = false
#ldap_schema = ad
ldap_schema = rfc2307bis
ldap_tls_reqcert = never
ldap_id_use_start_tls = false
ldap_network_timeout = 6
override_gid = 100
enumerate = true
cache_credentials = true
cache_sensitive = false
entry_cache_timeout = 600
debug_level = 9

[sssd]
services = nss, pam
config_file_version = 2
domains = THEDOMAIN
debug_level = 9

[nss]
filter_users = root,named,avahi,haldaemon,dbus,radiusd,news,nscd
override_homedir = /home/%u
default_shell = /bin/bash
debug_level = 9

[pam]

[sudo]

[autofs]

[ssh]
========================================================

When I start sssd, the users from the domain appear for a few seconds,
then disappear, which obviously corresponds to the moment sssd
dies. As far as I can tell, nothing in the logs
points to a fix.

Can someone help me with that?

Thanks,

--------------------------------------------------------------------------------

Christian
Lukas Slebodnik
2015-01-12 07:52:24 UTC
When I start sssd, the users from the domain appear for a few seconds, then
disappear, which obviously corresponds to the moment sssd dies.
Did sssd crash?
As far as I can tell, nothing in the logs points to a fix.
Can someone help me with that?
I can see that you have enabled verbose debugging in sssd.
You can try to find the most critical messages with the following grep command:
grep -E "\(0x00[1-9]0\)" sssd_xyz.log
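
For logs that are easier to post-process in a script, the same filter can be sketched in Python. This is only an illustration, not part of SSSD: the function name `critical_lines` is hypothetical, and the pattern simply mirrors the grep expression above (levels 0x0010 through 0x0090). The sample lines are taken from this thread.

```python
import re

# Same pattern as the grep command: a literal "(0x00N0)" with N in 1..9.
LEVEL_RE = re.compile(r"\((0x00[1-9]0)\)")


def critical_lines(lines):
    """Return (level, line) pairs for lines carrying a low-numbered debug level.

    Hypothetical helper: it only mirrors the grep filter and does not
    interpret what each level means.
    """
    selected = []
    for line in lines:
        match = LEVEL_RE.search(line)
        if match is not None:
            selected.append((int(match.group(1), 16), line.rstrip("\n")))
    return selected


if __name__ == "__main__":
    sample = [
        "(Mon Jan 12 08:32:56 2015) [sssd] [sbus_dispatch] (0x0080): "
        "Connection is not open for dispatching.",
        "(Mon Jan 12 08:32:56 2015) [sssd] [mt_svc_exit_handler] (0x0040): "
        "Child [SERVINFO] terminated with signal [6]",
        "[sdap_get_rootdse_send] (0x4000): Getting rootdse",
    ]
    for level, line in critical_lines(sample):
        print(hex(level), line)
```

To scan a real file, pass an open file object instead of the sample list, since iterating over it yields lines.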

If you do not find anything suspicious, please send the log files.

Which version of sssd are you using?

LS
Lukas Slebodnik
2015-01-12 13:58:32 UTC
It looks like it exited gracefully, from my sssd.log
I'm using sssd version 1.11.6
I'm attaching all (rather big) logs
From sssd.log:
(Mon Jan 12 08:32:56 2015) [sssd] [sbus_dispatch] (0x0080): Connection is not open for dispatching.
(Mon Jan 12 08:32:56 2015) [sssd] [mt_svc_exit_handler] (0x0040): Child [SERVINFO] terminated with signal [6]

Signal 6 is SIGABRT.
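
As a small aside, the signal number in such a monitor message can be translated to its name with Python's standard `signal` module. The sketch below assumes the message format shown in the log excerpt above; the helper name `describe_child_exit` is hypothetical, and signal numbering is platform-dependent (6 is SIGABRT on Linux).

```python
import re
import signal

# Matches the monitor message quoted above, e.g.
# "Child [SERVINFO] terminated with signal [6]".
EXIT_RE = re.compile(
    r"Child \[(?P<name>[^\]]+)\] terminated with signal \[(?P<signum>\d+)\]"
)


def describe_child_exit(line):
    """Return (child_name, signal_number, signal_name), or None if no match."""
    match = EXIT_RE.search(line)
    if match is None:
        return None
    signum = int(match.group("signum"))
    try:
        signame = signal.Signals(signum).name
    except ValueError:
        # Number not known to this platform's signal table.
        signame = "unknown"
    return match.group("name"), signum, signame


if __name__ == "__main__":
    line = (
        "(Mon Jan 12 08:32:56 2015) [sssd] [mt_svc_exit_handler] (0x0040): "
        "Child [SERVINFO] terminated with signal [6]"
    )
    print(describe_child_exit(line))
```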

Could you provide a core dump?

If you are using CentOS or Fedora then the simplest way is to use abrt.

LS
Christian Tardif
2015-01-13 03:33:37 UTC
One thing I've noticed tonight, while doing some more tests, is that
sssd doesn't crash anymore when I'm not setting enumerate to true. I'm
still not able to login, but I'll be working on this one. I'm making
some progress....
Christian Tardif
2015-01-13 03:43:17 UTC
OK, now I can log in. I was using the pam_listfile.so module, but the
group required to allow login did not have the POSIX gid it needs to be
visible on the Linux box. Now it has.

So my main problem is the inability to use enumerate=true. Not
necessarily a big deal, but maybe worth investigating why.
_______________________________________________
sssd-users mailing list
https://lists.fedorahosted.org/mailman/listinfo/sssd-users
Lukas Slebodnik
2015-01-13 07:58:53 UTC
I looked at the log file one more time and
found that the crash happened only while enumerating services.

It might be caused by the fact that a different LDAP connection was being used for
services.

[sdap_ldap_connect_callback_add] (0x1000): New LDAP connection to [ldap://orion.int.servinfo.test:389/??base] with fd [19].
[sdap_get_rootdse_send] (0x4000): Getting rootdse

//snip

[sdap_get_services_next_base] (0x0400): Searching for services with base [dc=servinfo,dc=test]
[sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(&(objectclass=ipService)(cn=*)(ipServicePort=*)(ipServiceProtocol=*))

[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [objectClass]
[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [cn]
[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [ipServicePort]
[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [ipServiceProtocol]
[sdap_get_generic_ext_step] (0x1000): Requesting attrs: [uSNChanged]
[sdap_get_generic_ext_step] (0x2000): ldap_search_ext called, msgid = 5
[sdap_process_result] (0x2000): Trace: sh[0x256a080], connected[1], ops[0x256b430], ldap[0x256a190]
[sdap_ldap_connect_callback_add] (0x1000): New LDAP connection to [ldap://servinfo.test/CN=Configuration,DC=servinfo,DC=test] with fd [21]

//after few lines

[sdap_process_result] (0x0040): ldap_result error: [Can't contact LDAP server]
[remove_connection_callback] (0x4000): Successfully removed connection callback.
[server_setup] (0x0400): CONFDB: /var/lib/sss/db/config.ldb
^^^^^^^^^^^^^
process was restarted


I can see in the log file that only the first LDAP server should be used:
[dp_get_options] (0x0400): Option ldap_uri has value ldap://orion.int.servinfo.test/
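
The comparison made above (connections opened to a host other than the one in ldap_uri) can be sketched as a small script. This is an illustration only: the helper name `unexpected_ldap_hosts` is hypothetical, and it assumes the "New LDAP connection to [...]" message format seen in the excerpts above.

```python
import re
from urllib.parse import urlparse

# Matches the connection message quoted in the log excerpts above.
CONNECT_RE = re.compile(r"New LDAP connection to \[(?P<uri>ldap[^\]]*)\]")


def unexpected_ldap_hosts(lines, configured_uri):
    """List hosts SSSD connected to that differ from the configured ldap_uri host."""
    expected = urlparse(configured_uri).hostname
    hosts = []
    for line in lines:
        match = CONNECT_RE.search(line)
        if match is None:
            continue
        host = urlparse(match.group("uri")).hostname
        if host != expected and host not in hosts:
            hosts.append(host)
    return hosts


if __name__ == "__main__":
    sample = [
        "[sdap_ldap_connect_callback_add] (0x1000): New LDAP connection to "
        "[ldap://orion.int.servinfo.test:389/??base] with fd [19].",
        "[sdap_ldap_connect_callback_add] (0x1000): New LDAP connection to "
        "[ldap://servinfo.test/CN=Configuration,DC=servinfo,DC=test] with fd [21]",
    ]
    print(unexpected_ldap_hosts(sample, "ldap://orion.int.servinfo.test/"))
```

A non-empty result does not prove that referrals are the cause. It only shows that a second server was contacted.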


I may be wrong, but this may be caused by LDAP referrals.

You can try disabling them in sssd.
Put the following line into the domain section of sssd.conf:

ldap_referrals = false
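
To double-check which values a domain section actually carries, a rough sketch with Python's `configparser` follows. This is an assumption-laden approximation: SSSD uses its own parser, so edge cases may differ, and the helper name `domain_settings` is hypothetical. Interpolation is disabled because values such as /home/%u would otherwise be treated as format strings.

```python
import configparser


def domain_settings(conf_text, domain, keys=("enumerate", "ldap_referrals")):
    """Return the raw string values of selected options in [domain/<domain>].

    Options that are not set come back as None, meaning SSSD's built-in
    default applies.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    section = f"domain/{domain}"
    return {key: parser.get(section, key, fallback=None) for key in keys}


if __name__ == "__main__":
    # Trimmed fragment of the configuration posted at the top of the thread.
    conf = """
[domain/THEDOMAIN]
id_provider = ldap
enumerate = true
ldap_referrals = false

[nss]
override_homedir = /home/%u
"""
    print(domain_settings(conf, "THEDOMAIN"))
```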

LS
Jakub Hrozek
2015-01-13 08:48:21 UTC
A core file would help us more here, but I suspect that a reconnection caused
some internal structure allocated on the connection object to
be released and then reused.

Which sssd version is this? IIRC Sumit patched a similar situation a
couple of months ago.
Post by Lukas Slebodnik
I may be wrong but it may be caused by LDAP referrals.
You can grep the logs for sdap_rebind_proc to be sure.
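
As an alternative to grep, a short script can tally how often each function name appears in a log, which makes it easy to see whether sdap_rebind_proc shows up at all. This is a sketch under the assumption that lines follow the "[function] (0xNNNN): message" layout seen in this thread. The helper name `count_functions` is hypothetical.

```python
import re
from collections import Counter

# The function name is the bracketed field directly before the "(0xNNNN):" level.
FUNC_RE = re.compile(r"\[(?P<func>[A-Za-z_][A-Za-z0-9_]*)\] \(0x[0-9a-fA-F]{4}\):")


def count_functions(lines):
    """Count log lines per emitting function name."""
    counts = Counter()
    for line in lines:
        match = FUNC_RE.search(line)
        if match is not None:
            counts[match.group("func")] += 1
    return counts


if __name__ == "__main__":
    # Lines quoted earlier in the thread; the full logs were not posted to the list.
    sample = [
        "[sdap_process_result] (0x0040): ldap_result error: [Can't contact LDAP server]",
        "[remove_connection_callback] (0x4000): Successfully removed connection callback.",
        "(Mon Jan 12 08:32:56 2015) [sssd] [sbus_dispatch] (0x0080): "
        "Connection is not open for dispatching.",
    ]
    counts = count_functions(sample)
    print("sdap_rebind_proc lines:", counts["sdap_rebind_proc"])
```

The short excerpts quoted in this thread contain no sdap_rebind_proc lines, so only the complete log can settle the question.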

btw I didn't let the logs through to the list, they were a bit too big
for everyone's mailbox :-)
Lukas Slebodnik
2015-01-15 08:41:09 UTC
Christian,

did it help to disable referrals?

LS
Christian Tardif
2015-01-19 15:31:09 UTC
No, it didn't. I'm getting the same issue with or without
referrals enabled. The only way to keep the sssd daemon up has been, so far, to
disable enumeration (enumerate = false) in the domain config.
